Saturday, December 31, 2011

RavenDB on Linux - Source Code

I've created a new GitHub repository to temporarily host the updated RavenDB code that enables execution under Linux. I say temporarily because a few things could happen.

The worst case, which I don't believe will happen but remains a slight possibility, is that I am asked to shut down the repository because I unknowingly violated some terms of use. For an open source project, this is highly unlikely.

The normal case would be that I get no notice from the creators of RavenDB, in which case this code would continue to exist in its own repository.

The best case, and I would really like this to be the case, would be that my changes are accommodated upstream in some shape, making RavenDB a cross-platform tool.

I did not investigate why the OutputStream.Flush() call was causing an exception. This is also my first real attempt at MEF and .Net 4.0, and I don't know why the exports were not automagically loaded; to work around that, I had to load them manually using reflection. A better fix would be to identify and resolve the underlying issues.

I am glad, however, that I was able to fulfill a personal quest of learning about RavenDB and, in the process, making it run under Linux. This opens up the possibility of making RavenDB a serious contender against MongoDB on non-Windows platforms.

RavenDB along with my source code changes is available at https://github.com/jimmy00784/ravendb (updated from https://github.com/jimmy00784/RavenDB-for-Linux).

Note: Source code url updated.


This article is part of the series NoSQL - RavenDB on Linux. The series contains the following articles:
NoSQL - RavenDB on Linux
Open Source Shines - RavenDB on Linux
RavenDB on Linux - Source Code
RavenDB on Linux - Update

Open Source Shines - RavenDB on Linux

In the previous article, I wrote about RavenDB - a .Net NoSQL Document Oriented Database that supports Transactions - and why I had decided to spend time getting it to run under Linux. RavenDB was created for a Windows-only audience, and the creator's website does not provide any guidelines about how to run it under Linux.

After numerous attempts and with some modifications to the source code, I was able to get it to run under Linux. Some features that are available under Windows are not available under Linux, such as the web-based UI, Raven Studio. Dynamic index functionality also did not seem to work under Linux for some reason.

In my initial review I had failed to notice something. When the server application starts at the command line, you have a few options such as garbage collect, clear screen, and reset. When I ran the test application that I had created, I noticed, but did not pay attention to, the fact that the server was not sending responses to all of the calls sent by the client. When this happened, the client froze until either I issued a reset command at the server command line or the connection eventually timed out. Since reset seemed to get the ball rolling for the moment, I was not too concerned at the time.

A couple of days ago, I decided to spend time finishing the task that I had started - to tweak RavenDB's code enough to get the database to fully work under Linux and then to use it in a project. My many efforts to identify the root cause of the issue were not at all fruitful. My first instinct was to review the server log files. There was nothing useful to be found there.

I thought that maybe the document Id generator was causing the holdup, so I provided manually generated Ids to my document objects. When that didn't solve the issue, I started placing breakpoints all over the client library code and debugged the application one line at a time, stepping into every function call possible. I came across a piece of code that seemed like a promising lead.

It was in the Raven.Client.Lightweight project in the Connections/HttpJsonRequest.cs file at line 304.

var text = reader.ReadToEnd();

When control reached that statement, execution froze until I issued the reset command at the server command line. It meant only one thing - there had to be a corresponding response-writing activity on the server. I reviewed the code a couple of lines above to get some clues. I evaluated the response object on line 299 for its ResponseUri property.

ResponseHeaders = response.Headers;

The ResponseUri property was set to "http://localhost:8080/bulk_docs".

At the same time the server had the following output on the terminal:


That was my clue. I needed to locate a responder that handled the "bulk_docs" url.
The Raven.Database project has a responder that handles "bulk_docs" in Server/Responders/DocumentBatch.cs. Setting a breakpoint on line 38 of that file and following the control flow line by line led me to line 318 of Server/HttpServer.cs, which always ended in an exception with a not very meaningful error message; once this method executed, control stopped.


I debugged further into FinalizeRequestProcessing, which led me to line 351.

ctx.FinalizeResonse();

This took me to Server/Abstractions/HttpListenerContextAdapter.cs line 82. Control disappeared once I tried to step further than line 86.

ResponseInternal.OutputStream.Flush();

I put a breakpoint in the catch block and sure enough there was an exception - an I/O exception of some sort - which was not handled. A couple of lines below was the smoking gun. Line 88 never executed in the event of an exception, and that would be what was causing the server to hang on to the request.

ctx.Response.Close();

I made a slight tweak to this file and moved the two lines into the finally block.


I recompiled the server and there it was. RavenDB was no longer freezing on individual requests and dynamic indices were working as designed. RavenDB on Linux is now ready to be used in application development. Since RavenDB was created as an Open Source application, it was possible to review the code and troubleshoot this issue. The issue does not seem to affect Windows, given that RavenDB is used there by real-world businesses.

The next thing to do would be to create a non-WPF replacement for Raven Studio for use under Linux.



Saturday, December 24, 2011

NoSQL - RavenDB on Linux

For a while now I've been on a quest to find a NoSQL database that met the following criteria:
  1. Document Oriented Database
  2. Supports Transactions across multiple Documents
  3. .Net/Mono compatible drivers
  4. Runs under Linux
And the journey has been anything but easy.

Ever since I first read about Document Oriented NoSQL Databases, I've been fascinated by them. I started looking into CouchDB [apache.org] at first, but MongoDB [mongodb.org] soon became my favorite.
MongoDB is a Document Oriented Database that is written in C++ with performance in mind. MongoDB allowed me to write applications using my favorite languages and ran on many Operating Systems. After doing some research, I also found a port of MongoDB for ARM. It now runs on a Debian server that I have set up on a PogoPlug, which runs on an ARM chipset. The JSON style BSON representation of data was simple to follow, and the ability to store any object in MongoDB without having to implement special interfaces or inherit from special classes made it that much more appealing to me. The .Net drivers were readily available on their website along with plenty of documentation on how to use them.

MongoDB met three out of the four criteria I had above. There was one that it didn't satisfy: MongoDB doesn't yet support Transactions [mongodb.org], and there is no word out yet that would promise availability of transactions any time soon.

I continued my research and came across RavenDB [ravendb.net]. RavenDB is a Document Oriented Database that is written completely in .Net. It is also Mono compatible. Unlike MongoDB, RavenDB does support transactions across multiple documents; however, it only runs under Windows. Even though RavenDB doesn't meet all four criteria either, the fact that it was written completely in .Net gives me something to tinker with to see if it could be made to run under Linux.

My first attempt was a few months ago when I first read about RavenDB. I downloaded the code from their GitHub git repository and fired up MonoDevelop [monodevelop.com]. The issues became immediately apparent: heavy reliance on Silverlight. But even then, there was .Net 4.0 code that my mono compiler could not make sense of. After spending a few minutes on it, I gave up.
A few days ago, with the update of Ubuntu, I got a near-latest version of Mono installed on my laptop. I decided to give RavenDB another try.

From my konsole, I issued the commands to get the latest code for RavenDB and create a new branch - linux.

git clone git://github.com/ravendb/ravendb.git
cd ravendb
git branch linux
git checkout linux

As soon as I fired up MonoDevelop and loaded the solution, I was greeted with this error message:
Also, the projects Raven.Backup and Raven.Smuggler were not set to build under my current configuration. I noticed that upon Right Click -> Options -> Build -> Configuration on the Raven.Backup project, the only configurations that were available were for x86. I was able to select Debug, click the Copy button and create another configuration for All CPU. I repeated this for Release and then followed the same steps for the other project as well.
Then from the Right Click -> Options -> Build -> Configurations -> Configuration Mapping tab, I selected the correct configuration for the two projects. The projects were no longer marked as "unable to build under current configuration." Now, on to the issue with the three projects failing to load: Raven.Client.Silverlight, Raven.Studio, Raven.Tests.Silverlight.
While anything Silverlight under Linux (using Moonlight) was not a promising prospect to begin with, I still wanted to give it a shot before beginning the removal of non-compilable projects. I opened the project files in plain text format and searched for the GUIDs from the screen print above. On finding them, I commented them out and attempted to reload the projects. The solution was now ready for the first compile.

First attempt:
95 build errors. I also noticed that at least a few of those errors were really warnings. This meant one thing - some or all of the projects were set up to treat warnings as errors. That was an easy fix. On all the projects, I unchecked the "Treat warnings as errors" box on the Right click -> Options -> Build -> Compiler screen. Time for the second compile attempt.
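
For anyone who prefers the terminal over clicking through every project, the same change can be sketched as a small shell function. This is only a sketch under assumptions: it assumes each project file stores the checkbox as the MSBuild property <TreatWarningsAsErrors>true</TreatWarningsAsErrors> on a single line, and the function name disable_warnings_as_errors is made up for this example.

```shell
#!/bin/sh
# Sketch: turn off "treat warnings as errors" in every .csproj under a directory.
# Assumption: the setting is stored on one line as
#   <TreatWarningsAsErrors>true</TreatWarningsAsErrors>
disable_warnings_as_errors() {
    root_dir=$1
    find "$root_dir" -type f -name '*.csproj' | while IFS= read -r project_file; do
        # Write to a temporary file first so a failed sed never truncates the project.
        sed 's#<TreatWarningsAsErrors>true</TreatWarningsAsErrors>#<TreatWarningsAsErrors>false</TreatWarningsAsErrors>#' \
            "$project_file" > "$project_file.tmp" &&
            mv "$project_file.tmp" "$project_file"
    done
}
```

Run from the root of the checkout it would be called as disable_warnings_as_errors . and the result can be reviewed with git diff before compiling.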


Second attempt:
100 build errors. One of the first errors was in the Raven.Client.Debug project, on classes from the Microsoft.VisualStudio.DebuggerVisualizers namespace. A dependency on Visual Studio would be a problem under Linux. I decided not to dwell too much on this error and chose to remove this project altogether from the solution. All the dependencies on Raven.Client.Debug would also have to be removed.


Third attempt:
95 build errors. This time it was the NLog namespace under the Raven.Tryouts project. Reviewing the error revealed that NLog dll was not compatible with the current Mono runtime. Fortunately, NLog had a Mono compatible binary available for download on Codeplex. I downloaded it and replaced the reference in the project.


Fourth attempt:
94 build errors. Upon closer inspection, I observed that Raven.Web relied on System.Web.Entity, which is not available under Mono. Also, Raven.Client.Silverlight relied on System.Windows and System.Windows.Browser, which are also not included with Mono. I decided to remove these two projects from compilation along with the Raven.Tests.Silverlight project. The Raven.Studio project was the next one to go since it was a WPF application - lots of XAML files - and WPF is not fully implemented under Mono.


Fifth attempt:
0 build errors. Are we there yet? Let's give it a try. I ran the Raven.Server project. Bummer! Runtime exception. DllNotFoundException. It turns out that the Raven.Storage.Esent project implements Microsoft's ISAM Esent storage, which is proprietary and requires Windows in order to run. Since we also had the Raven.Storage.Managed project, I decided to modify the configuration of the application to use the managed storage library Munin instead of the unmanaged Esent. I modified the App.Config under Raven.Server and changed the value for "Raven/StorageEngine" to "Munin" from "Esent".
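
The same edit can be scripted, which helps when the setting has to be flipped again after pulling fresh code. The following is a rough sketch, assuming the setting sits in a standard appSettings entry of the form <add key="Raven/StorageEngine" value="Esent" />; the exact layout of the real App.Config is not shown here, and the function name set_storage_engine is hypothetical.

```shell
#!/bin/sh
# Sketch: change the value of the "Raven/StorageEngine" appSettings entry.
# Assumption: the entry looks like
#   <add key="Raven/StorageEngine" value="Esent" />
# with key and value on the same line, in that order.
set_storage_engine() {
    config_file=$1
    engine=$2
    # Keep everything up to the opening quote of the value, then swap the value.
    sed "s#\(key=\"Raven/StorageEngine\"[[:space:]]*value=\"\)[^\"]*\"#\1$engine\"#" \
        "$config_file" > "$config_file.tmp" &&
        mv "$config_file.tmp" "$config_file"
}
```

For example, set_storage_engine Raven.Server/App.config Munin would make the change described above, with the file path adjusted to wherever the config file lives in the checkout.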

Sixth attempt:
0 build errors (expected). Yet another runtime error, however. This time it was a MissingManifestResourceException with a very vague description and a long stack trace. However, right before the stack trace took me into obscurity, there was a hint - ravendb/Raven.Database/Server/HttpServer.cs:110. The SatisfyImportsOnce call was failing after some execution. After doing some reading, I found out that RavenDB was built on MEF, which is a feature of .Net 4.0 and is supposed to provide easy plugin/extension functionality to .Net programs. That line was supposed to initialize the extensions once. Somewhere in the code below there would be a code block that would expect non-null values. I commented out that line and proceeded.

Seventh attempt:
0 build errors. As expected, NullReferenceException on Raven.Database/Server/HttpServer.cs:112. RequestResponders was supposed to be non-null. I did some more research to understand how MEF was supposed to implement the extensibility. I had to find classes that inherited from AbstractRequestResponder. Maybe manually loading objects into RequestResponders would help. And indeed I found plenty of classes that inherited from the AbstractRequestResponder and RequestResponder classes. I added the following lines of code in HttpServer.cs:


Eighth attempt:
0 build errors. Another exception, this time a NullReferenceException at Server/HttpServer.cs:193. This seemed similar to the previous issue but with a different object - ConfigureHttpListeners. Time to repeat the exercise with a different base type - an interface this time.


Ninth attempt:
0 build errors. I started seeing some output repeating and they didn't look like error messages:

Available commands: cls, reset, gc, q
Could not understand:

I decided to run the compiled application from konsole to see how it would behave there.

Raven is ready to process requests. Build 13, Version 1.0.0.0 / abcdef0
Server started in 847 ms
Data directory: /home/karim/Projects/RavenDB/ravendb/Raven.Server/bin/Debug/Data
HostName: <any> Port: 8080, Storage: Munin
Server Url: http://karim-laptop:8080/
Available commands: cls, reset, gc, q

Time to run some tests. I created a small console .Net application and decided to fire it up. It worked! Some issues did surface, but nonetheless, it worked. I was able to store and retrieve simple documents to the database. Since I had removed some of the critical projects such as Raven.Web and Raven.Studio, the web interface was gone with them. Dynamic indices did not work either.
Now that I have a semi-functional database, I'll put some time in and try to create a simple Raven Studio replacement as well as work the other kinks out.

Overall, it helps that the application was written completely in .Net without any dependencies on the OS that could not be achieved via framework level abstraction. I am certain that RavenDB could be made to work under Linux with a similar level of confidence as it does under Windows.


Monday, December 19, 2011

CallCentric - VoIP Service Provider


CallCentric offers a variety of VoIP products for personal, residential, as well as office use. If you need to switch your traditional phone lines to SIP to save money on phone bills, or if you just need additional phone numbers besides your traditional phone lines, CallCentric has product offerings that rival those of its competitors.
CallCentric numbers can be very easily:
  1. configured on the IP Phones on your network
  2. forwarded to traditional phones
  3. routed to your internal PBX, like FreeSWITCH
Call forwarding from CallCentric to a SIP address is free, and so are calls from one CallCentric number to another. Local numbers are available. You can even transfer your existing number to CallCentric. Sign up for a free account here. Additional paid features can be added at any time.

Filesystem Encryption under Linux - EncFS

Customary disclaimer: The author takes absolutely no responsibility for any corruption or loss of data, especially those that might result out of application error, user error, or user negligence. Please also check your local laws pertaining to sharing encrypted data as it might be illegal in some countries. The author is not a lawyer and the contents of the article should not be perceived as legal advice under any circumstance. Please use at your own discretion.

In the previous article, I gave a quick introduction to FUSE and how ClamFS could be used to secure a filesystem against malware. Another area where FUSE shines is in the field of encrypted filesystems. There are quite a few options available when it comes to setting up encrypted storage for safe keeping of important content. In this article, I'll discuss one such encrypted storage option that is built using FUSE functionality: EncFS.

EncFS is an easy-to-set-up virtual filesystem that provides portable encrypted storage for your data. It is a pass-through filesystem, which means that while the files themselves exist on the filesystem encrypted, with EncFS loaded, they appear decrypted. Unlike many other encrypted filesystems, which require you to dedicate entire partitions to encrypted storage, EncFS exists on top of existing filesystems. You don't need to pre-allocate fixed amounts of space for it; hence, there is no wasted storage.

You can install EncFS easily from your software package manager. On Debian based distributions it is as simple as typing the following on the terminal window and providing the sudo user's password when prompted (exclude the $ sign):

$ sudo apt-get install encfs

Once it is installed, you can create your encrypted storage and a decrypted mount point with the following command:

$ encfs path-to-encrypted-storage path-to-decrypted-mount-point

Example:

$ encfs /tmp/encrypted /tmp/decrypted

You will be asked whether the folders should be created if they don't already exist. You will also be prompted to choose the level of encryption. Your choices are standard, paranoia, and expert.

Please choose from one of the following options:
enter "x" for expert configuration mode,
enter "p" for pre-configured paranoia mode,
anything else, or an empty line will select standard mode.

Once you have provided your choice, you will be prompted to enter and confirm a password to enable access to the encrypted storage. Do not forget this password or you will lose your encrypted data. You can now start using your encrypted storage. Anything that is saved under the decrypted mount point is actually encrypted and saved under the encrypted storage. The filenames as well as the content of the files will be encrypted. Do not save data directly under encrypted storage as data written directly there does not get encrypted.

Once you are done securing your data, you can unmount the decrypted mount point by using the fusermount command as follows:

$ fusermount -u path-to-decrypted-mount-point

To remount the encrypted storage, just reissue the encfs command and provide your password:

$ encfs path-to-encrypted-storage path-to-decrypted-mount-point
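
The mount, use, unmount cycle above can be wrapped in a small helper so that the decrypted view never stays mounted longer than needed. This is just a sketch built from the two commands shown in this article; the function name with_encfs is made up for illustration, and it assumes encfs and fusermount are on the PATH.

```shell
#!/bin/sh
# Sketch: mount an EncFS volume, run one command, then always unmount.
# Usage: with_encfs ENCRYPTED_DIR MOUNT_POINT COMMAND [ARGS...]
with_encfs() {
    encrypted_dir=$1
    mount_point=$2
    shift 2
    # encfs prompts for the password; give up if mounting fails.
    encfs "$encrypted_dir" "$mount_point" || return 1
    # Run the caller's command and remember how it ended.
    if "$@"; then
        status=0
    else
        status=$?
    fi
    # Unmount whether or not the command succeeded.
    fusermount -u "$mount_point"
    return "$status"
}
```

For example, with_encfs /tmp/encrypted /tmp/decrypted cp report.txt /tmp/decrypted/ would copy a file (report.txt is a placeholder) into the encrypted storage and unmount right away.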


Conclusion

EncFS's ease of use makes it an ideal tool for providing ad-hoc encryption on existing filesystems without having to spend any money on additional hardware. EncFS is also very flexible and portable, and allows you to store encrypted data alongside unencrypted data. It is however necessary to point out that, as with any other encryption scenario, it is important to maintain secure unencrypted backup copies of critical data for emergencies such as filesystem and hardware failures.


You may also like the series - ClamAV - Antivirus for Linux:
ClamAV - Antivirus for Linux
ClamFS - Antivirus Filesystem for Linux

Friday, December 16, 2011

ClamFS - Antivirus Filesystem for Linux

Disclaimer: Flying through Linux and OpenSource. You might experience Freedom along with plenty of awesomeness.

Linux has never failed to amaze me with its simplicity and its rich feature set. That was one of the reasons that convinced me to switch to Linux as the primary Operating System on my computers. One such feature is the ability to load virtual filesystems in userspace using the FUSE module. When enabled, the FUSE module allows you to load custom data stores that could be traditional and non-traditional filesystems or custom programs that allow filesystem-like interaction. You could then interact with these custom userspace filesystems as though they were like any other filesystem. Depending on the FUSE module, you could browse the folders and create, modify, and delete data as though they were regular files on your computer.

Two of the many real world applications of FUSE are in the following areas:
  1. Antivirus filesystems - Creates virtual filesystems that trigger antivirus scanning whenever files within the filesystems are written to or read from.

  2. Encrypted filesystems - Creates virtual filesystems that automatically encrypt files on write operations and decrypt them on reads.
In this article, I'll introduce ClamFS, which lets you create a virtual antivirus filesystem on top of your existing filesystem.

ClamFS triggers automatic scanning of files using ClamAV whenever I/O is performed on them. The best candidate for ClamFS would be the default downloads folder where your Internet browser saves files that are downloaded from the Internet. You would ideally start with an empty Downloads folder; however, ClamFS also allows you to secure folders that already have contents in them.

ClamFS can be easily installed from your Linux distribution's software package manager. Just search for clamfs. For each virtual antivirus filesystem that you'd like to create, you will need an XML configuration file. The configuration file will let you define the following:
  1. What folder to secure, where to mount the virtual filesystem
  2. What maximum file size to scan - to increase performance
  3. Whitelist extensions - files with these extensions will never be scanned
  4. Blacklist extensions - files with these extensions will always be scanned regardless of maximum file size parameter
  5. Logging method - standard out, syslog, file, or email
In the event that you attempt to download or write an infected file to the ClamFS virtual filesystem, an error entry will be logged and the file will not be successfully written to your filesystem. If an infected file already exists on the ClamFS virtual filesystem, you will not be allowed to read from it.

Here is an example configuration: clamfs.xml

The command to enable ClamFS protected virtual filesystem is:

$ sudo clamfs configuration.xml

You can find more information about ClamFS and on the configuration XML on their website.



ClamAV - Antivirus for Linux

Disclaimer: Flying through Linux and OpenSource. You might experience Freedom along with plenty of awesomeness.


In today's digital world, where access to electronic content is so convenient that it is almost taken for granted, there is a constant threat of malware infection.
While Linux computers are much less likely to be infected by malware transmitted through shared files on removable media or over the Internet, it is certainly not impossible. And while you yourself may not be too concerned about getting your computer infected because of an infected file, you may unknowingly put at risk the non-Linux users with whom you share files. These could be your family, friends, and colleagues.

There are a few steps that one can take to prevent and minimize getting and spreading such infections, and the most common one happens to be installing antivirus software. Many antivirus software vendors today offer paid as well as free versions for Linux computers. When I was a Windows user, one of the first things I did after (re)installing the operating system on my computer was to install an antivirus on it. I hated using antivirus software, especially the real-time scan features, since they slowed my computer, negatively affecting the overall experience. Ever since I made the switch to Linux a few years ago I've not had that problem, and I've gotten used to using my computers without any antivirus software installed. I changed my mind a few weeks ago when I realized that malware was no longer restricted to standard executable files. Malware authors are now exploiting vulnerabilities in popular software like Adobe PDF Reader and Flash Player and packaging malware in pdf, swf, and other files. While Linux users should still be relatively safe from such infections, non-Linux users, especially Windows users, are not. Therefore, in order to stop the malware from spreading out from your computer, it is essential that it be detected there first.

ClamAV is a popular open source antivirus that has been around for many years. Its website describes it as follows.
ClamAV is an open source (GPL) antivirus engine designed for detecting Trojans, viruses, malware and other malicious threats. It is the de facto standard for mail gateway scanning. It provides a high performance multi-threaded scanning daemon, command line utilities for on demand file scanning, and an intelligent tool for automatic signature updates.
It is available on many popular Linux distributions. On Ubuntu, it can be installed using the Software Center or any Package Manager. Just search for clamav. You will also need to install the virus definitions updater for ClamAV, which is known as freshclam. Keep in mind that ClamAV does not install a graphical user interface, and once you've installed clamav and freshclam, they are only usable via the command line shell. There are however a few GUI tools available that can be used with clamav, and one such tool is ClamTk, which can also be installed from the software manager. Once installed, ClamTk will allow you to scan files and directories using clamav. It will also allow you to configure clamav using its Advanced -> Preferences menu entry.
  
If you are wondering what that Last infected file was, I used a test virus file, EICAR.COM, that is readily available over the Internet and can be used to test if the antivirus is in fact scanning and detecting infected files.
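
For those who would rather stay on the command line, a scan can be wrapped roughly as follows. This is a minimal sketch, not part of ClamAV itself: the function name scan_path is made up, and it relies on clamscan's documented exit codes (0 for no virus found, 1 for virus found, anything else for an error).

```shell
#!/bin/sh
# Sketch: scan a file or folder with clamscan and summarize the result.
# Usage: scan_path PATH
scan_path() {
    target=$1
    # -r recurses into directories; --infected prints only infected files.
    if clamscan -r --infected "$target"; then
        echo "clean: $target"
    else
        # Inside the else branch, $? still holds clamscan's exit status.
        case $? in
            1) echo "infected files found in: $target" ;;
            *) echo "scan error for: $target" ;;
        esac
    fi
}
```

Running freshclam first keeps the virus definitions current; after that, something like scan_path ~/Downloads would check the downloads folder, and pointing it at a folder holding the test file should report an infection.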

While you could use clamav from the command line as well as through the ClamTk GUI, most modern graphical Linux distributions also allow you to add an entry to the right click context menu to scan a file or a folder using ClamAV. Here is a screen shot from my laptop, which runs Kubuntu 11.10, showing the context menu entry.



While no antivirus software will make your computer 100% safe and secure, they will certainly help. ClamAV is the antivirus of choice by most Linux server administrators and computer users. At the end of the day however it boils down to how responsible and cautious you yourself are. I hope you found this article helpful.

This article is part of the series ClamAV - Antivirus for Linux:
ClamAV - Antivirus for Linux
ClamFS - Antivirus Filesystem for Linux

You may also like:
Filesystem Encryption under Linux - EncFS