Monday, November 15, 2010

Using yum behind a proxy

I was recently attempting to install a package on CentOS 5 and ran into an error with the yum install command, shown below:
Could not retrieve mirrorlist http://mirrorlist.centos.org/?release=5&arch=x86_64&repo=os error was [Errno 4] IOError: <urlopen error (111, 'Connection refused')>


Setting the http_proxy environment variable did not help matters. I found documentation on forums that suggested adding the line proxy_http=http://proxy:port to /etc/yum.conf. That still did not help. I finally came across a post on the CentOS forums pointing out that the setting to be made in /etc/yum.conf is actually proxy=... and not http_proxy=...
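For reference, the working form of the setting looks like this in /etc/yum.conf (the proxy host and port are placeholders; the optional auth keys are only needed if your proxy requires a login):

```
# /etc/yum.conf - note the key is proxy, not http_proxy
[main]
proxy=http://proxy.example.com:3128
# optional, only if the proxy requires authentication:
#proxy_username=user
#proxy_password=pass
```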

Tuesday, September 14, 2010

Image Locations for Pentaho Reports 3.0

My experience with pentaho's report designer is that at design time, images used in the report can be at any arbitrary location. On publishing to the XML format, each image used in the report design (.report file) is placed in the same directory as the output file but with a new name in the format Report_staticImage[N].[ext], where N is a number assigned per image starting from zero, and ext is the original file extension, e.g. png.

This behaviour is bound to cause trouble for someone using different locations for the development and production environments. Ideally, the production environment should be set up to reflect the local one or vice versa, but sometimes this is not possible. To get around this, the production path should be created on the development environment and the desired solution folder linked to the repository location.

For example, let the production path be /path/to/prod/pentaho-solutions/solution1. In this case, solution1 is the directory containing .xaction files and report templates (.xml). Let the development path be /path/to/dev/pentaho-solutions. A link should be created from the duplicated/replicated production path to the actual repository location on the development environment.

This means that the report designer and pentaho design studio on development will manipulate the files in /path/to/prod/pentaho-solutions/solution1, resulting in paths that will match the production environment on deployment. The local pentaho server, which in my case knows the repository location as /path/to/dev/pentaho-solutions, can access the solution directory as a link - /path/to/dev/pentaho-solutions/soft-link-to-solution1.
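The setup above can be sketched as shell commands; a /tmp prefix is added to the example paths so the sketch is safe to try anywhere:

```shell
# Replicate the production layout on the development box, then link it
# into the dev repository location (paths follow the example above,
# prefixed with /tmp so the sketch does not touch real directories).
mkdir -p /tmp/path/to/prod/pentaho-solutions/solution1
mkdir -p /tmp/path/to/dev/pentaho-solutions
ln -sfn /tmp/path/to/prod/pentaho-solutions/solution1 \
        /tmp/path/to/dev/pentaho-solutions/soft-link-to-solution1
```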

Sunday, September 5, 2010

Katiba-Info Issues

Please report your issue (bug, error, feature request) as a comment on this blog post and I will endeavour to create a separate blog post where a more detailed discussion can be held and updates posted. You may want to browse through the list of issues first before adding a comment. Thank you for the feedback.

Katiba-Info Issue: Text Wrapping

Text wrapping is not handled gracefully when the chat/email window is narrow. The workaround for now is to resize the chat window so that the embedded line breaks fall at the end of the line. I am working on removing the embedded line breaks and white space, and the direction I am leaning toward is using a regular expression. Suggestions are welcome.
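A rough sketch of the normalization I have in mind, using tr and sed on made-up sample text (the real fix would be a regular expression inside the application):

```shell
# Collapse embedded line breaks and runs of whitespace into single
# spaces, then drop the trailing space. Sample text is illustrative.
printf 'Every person\nhas the  right\nto...\n' \
  | tr '\n' ' ' | sed -E 's/[[:space:]]+/ /g; s/ $//'
# → Every person has the right to...
```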

Narrow chat window
Wide chat window


Katiba-Info Issue: Unicode characters

Email and chat output is unable to display Unicode characters. I am reading an XML file into a cacheable byte array and thereafter wrapping that array in a ByteArrayInputStream, which is in turn wrapped in an InputStreamReader with UTF-8 encoding for use in the XSLT transformation. Suggestions are welcome.

Friday, September 3, 2010

Kenya Constitution Lookup via Email, Chat

After several weeks of coding and hacking, I hosted katiba info on google app engine on Sept 1st. The application is envisioned as a mobile-friendly site that allows articles and clauses of the Kenya constitution to be looked up via email and chat.
Sept 1st was long after my self-imposed deadline to have it online before the promulgation ceremony. The idea had started as a desire to have the constitution online in a way that people could access simply. Katiba.mobi was already online, but I wanted to utilize the app engine platform for its scaling capabilities. On an earlier project (Kura Info), I had used python and django and had been impressed by that combination, which greatly simplified web application development.

Having decided that the user interface part would be a walkover, I turned to the data, where I converted the PDF file to text using pdftotext. The task that faced me then was how to delimit the various sections of the document - chapters, articles, clauses, parts and so on. The article numbering does not restart at the chapter level in the body of the document, so I extracted that portion for further analysis, leaving out the TOC and appendices. The complexity of breaking down the text file was dawning on me, so I turned to a compiler construction book for ideas on how to automate the process of building parsers. The book led me to use flex for my first stage, where chapter headings, article headings, article numbers and clauses were surrounded by XML tags.

The next step according to compiler theory would have been to create a grammar, tokenize the file and feed it into bison. At that point, I decided to take a shortcut by using regular expressions in python to finish the task of surrounding chapter, part and article sections with new XML tags. With an XML version of the constitution available, my original plan had been to use it to create another XML file in a format suitable for uploading to the datastore. This approach was abandoned due to the additional complexity and time pressure. I decided instead to use XPath to mine the desired information. I quickly found out that XPath support on google app engine (python) was only available through a third party library and decided to switch to Java for its XML processing facilities.
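As an illustration of the regex stage, here is a sed stand-in for the python expressions; the tag name and sample text are made up:

```shell
# Wrap a top-level article number in a hypothetical XML tag.
printf '14. (1) All sovereign power belongs to the people.\n' \
  | sed -E 's/^([0-9]+)\. /<article num="\1">/'
# → <article num="14">(1) All sovereign power belongs to the people.
```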

I did not expect a repeat of the ease I had experienced while using django for the user interface, so I went for a user interaction medium that demanded the least in terms of UI development - email. When email support was ready and the application hosted, adding chat was a straightforward process.

Referendum Reflections

The referendum results are long out and the new constitution promulgated. I felt discouraged that Kenyans did not heed the call of the church as communicated by the clergy. Before the referendum, a radio host on Hope FM had asked listeners what they thought would happen after the vote. The text below is what I wrote in.

Am expecting after the referendum:

- That many 'Sauls' will preach the gospel they persecuted,

- That those who plot for the downfall of the church will be used for its promotion,

- That the evil meant against our country will work for good,

- That where the enemy planned defeat there be victory.

- Eric.

In my state of discouragement, I remembered my response above and decided to focus on its prophetic perspective, which is based on the biblical accounts of Joseph and Esther. Joseph faced opposition and ultimately rejection from his brothers, yet God made it work out for the good of him and his family. For Esther, it was the entire community of Jews that was endangered. Her intervention, prompted by Mordecai's counsel, left the Jews stronger and elevated them higher.

In Esther's scenario, there is a gem that especially shines in the current constitutional dispensation. It is written that Mordecai was remembered and honored by the king because he had once uncovered a plot against the king's life. In the same way, I believe God will remember and honor those who voted to save the lives of our unborn children even if their action is not appreciated or recognized right away.

Monday, August 30, 2010

Postgres DB Management Server

While trying out the Postgresql (EnterpriseDB 8.4) database, I wanted to see what the management server offers. Looking through the EnterpriseDB documentation did not turn up the port number, so I used lsof to list the ports in listen mode. This turned up several ports between 9011 and 9019. I tried some of them in the browser but none worked, possibly because of the local firewall.

After several unsuccessful searches, I navigated through the installation folders and realized that the management server runs on JBoss. I then made a specific search for the JBoss HTTP port configuration, as my experience was with Tomcat only. It was a pleasant surprise that JBoss uses Tomcat for its web apps. Looking through $EDB_HOME/mgmtsvr/server/default/deploy/jboss-web.deployer/server.xml revealed that the HTTP port is 9011. I added this to the firewall exclusion list and was ready to explore.
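The relevant fragment of that server.xml looks something like the following; only the port value comes from my installation, the other attributes are illustrative of a typical Tomcat connector:

```
<!-- $EDB_HOME/mgmtsvr/server/default/deploy/jboss-web.deployer/server.xml -->
<Connector port="9011" protocol="HTTP/1.1"
           connectionTimeout="20000" redirectPort="8443" />
```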

At first sight, the management server does not appear to have many features, especially compared with the Oracle enterprise console I have interacted with. With that ended another step in my journey with Postgresql.

Monday, July 12, 2010

Getting Started with Amanda

Having initially considered Bacula, I chose amanda for backups due to its better organized documentation. Being a newbie, I selected the quick start tutorial as the first step through the online documents.

In that process, I consulted the bsd authentication howto for more information on authentication. I also found the vtape howto very helpful since I did not have a physical tape device. Having followed the tutorial and branched into the howtos at the relevant sections, I was able to do a successful test run.

Using another article in the howtos collection, I made further changes to the configuration to allow for dumps to be split across multiple tapes (vtapes in my case). It was then time to follow another howto to duplicate that configuration to the production environment where the virtual tapes needed to be configured for use on a non-Linux partition.

Thursday, July 1, 2010

Creating a Pre-printed Report Template

One way to speed up printing receipts on a dot-matrix printer is to print the desired (dynamic) information on a pre-printed template that contains the labels, content placeholders, logos and other static information.

The first thing I did was to adjust the report to display the values only, as it had previously been printing all the dynamic and static information. I deleted all the static information and left the bare dynamic info that changes with every printout.

I then printed out the bare report on the target dot matrix printer. In the process, I made adjustments to paper guides on the printer so as to get an optimum placement. I repeated this process several times as the draft print outs revealed changes that needed to be made to the report writer design.

When the bare printout had an optimum placement, I took the printout and scanned it. I then opened the resulting image with the GNU Image Manipulation Program (GIMP). Using GIMP, I added layers containing the static information, e.g. text layers for labels and a transparent layer for content placeholders. I even added an all-white layer that I would hide/show to see how the report template would look on printed paper.

All this time, the original scanned layer was acting as a background which helped with the positioning. Thereafter, I hid this original layer (leaving the all white layer as the background) and printed the resulting report template on paper. Now it was time to test it by printing out the bare information report on the preprinted template. After a few minor adjustments and refinements made from observing the data placement on the template, the final version was ready for mass production.

Friday, June 11, 2010

Workaround for uploading data to google app engine dev server

Google app engine's dev server will ask for a google account before uploading data, as described in the uploading data section of the online docs.
I found this requirement restrictive since I did not have an internet connection at the time. My workaround was to comment out a section of the file that handles upload requests, i.e. $PYTHON_LIB/google/appengine/ext/remote_api/handler.py.

The following is the result of running diff handler.py.dist handler.py where the first file is the original file and the second is the modified one.


295,296c295,296
< if not self.CheckIsAdmin():
< return
---
> #if not self.CheckIsAdmin():
> # return


The appcfg.py script will still ask for an email and password after the change but any fictitious value will be accepted.
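The same edit can be scripted by line range with sed; the numbers 295-296 are from the SDK version above, so check yours first. A generated file stands in for handler.py here to keep the sketch runnable:

```shell
# Demo: comment out lines 295-296 by line range, as the manual edit does.
# A generated 300-line file stands in for handler.py; .bak is a backup.
seq 1 300 | sed 's/^/line /' > handler_demo.py
sed -i.bak '295,296 s/^/#/' handler_demo.py
sed -n '294,297p' handler_demo.py
```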

Wednesday, May 26, 2010

Making ProxyPassMatch work with Location in apache2

I have recently created a configuration where apache2 proxies requests to pentaho running on a tomcat backend. The first problem I faced with the ProxyPassMatch directive has to do with the way the url pointing to the backend is written. Specifically, using a url that includes a non-standard port like 8080 may result in a "ProxyPass Unable to parse URL" error. I fortunately found a workaround for that bug on the apache bug list.

When ProxyPassMatch is enclosed in a Location block, the first argument (the regex pattern) is left out and is used as the argument to the Location directive instead.

The url provided to the enclosed ProxyPassMatch should be combined with the expected result of the regex pattern. This is done using $n, where n is the count of parenthesized matches starting at 1.
If the parenthesized match(es) are not supplied to the url argument of ProxyPassMatch, the whole matching string is concatenated with that url, which may result in an unintended path.

Example


<Location ~ ^/(pentaho.*)$>
    Order Allow,Deny
    Allow from All
    ProxyPassMatch ajp://127.0.0.1:8329/$1
</Location>

Monday, April 19, 2010

Restarting Pentaho Automatically - the quick and dirty way

I look after a Pentaho 2 deployment which manifested a pattern of freezing after about a week. That appears to be a weakness in the code where an object grows bigger and bigger and is never garbage collected, leading the JVM to hang while attempting a full GC.

Not having the expertise to profile the memory usage of the pentaho system, I decided to have the application server restart every few days. The first issue I faced is that the provided stop script - stop-pentaho.sh - returns immediately after being invoked and does not guarantee that the application has been stopped.

Compiling jsvc (available in tomcat's bin directory as jsvc.tar.gz) gave me a guaranteed way of stopping tomcat. The following is the contents of a script I named stop-tc.sh; it accepts the tomcat HTTP listen port as its sole argument.


#! /bin/bash
#stop the tomcat instance whose HTTP port is provided at the command line
tmp_file=/tmp/tc-stop.pid

#use lsof to get the PID, write it to a temp file and call jsvc
lsof -a -i TCP:$1 -c java | grep LISTEN | awk '{print $2}' > $tmp_file && /usr/local/bin/jsvc -stop -pidfile $tmp_file org.apache.catalina.startup.Bootstrap


I thereafter turned my attention to the hypersonic database that holds pentaho admin information. There should be a command to shut down the database, but I decided to kill the process and delete the lock files. The complete restart script, which references the one above, is given below.


#! /bin/bash

export JAVA_HOME=/usr/lib/jvm/java-6-sun
#stop tomcat
/storage/kuali-scripts/stop-tc.sh 8286
#ensure tomcat quits - the file involved is created by stop-tc.sh above
kill -9 `cat /tmp/tc-stop.pid`

#kill hypersonic
kill -9 `lsof -a -i TCP:9001 -c java | grep LISTEN | awk '{print $2}'`

#delete hypersonic lock files
rm /storage/financials/pentaho2/biserver-ce/data/hsqldb/quartz.lck
rm /storage/financials/pentaho2/biserver-ce/data/hsqldb/hibernate.lck
rm /storage/financials/pentaho2/biserver-ce/data/hsqldb/sampledata.lck

#start pentaho
/storage/financials/pentaho2/biserver-ce/start-pentaho.sh &> /dev/null
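I have not shown how the restart is scheduled; a cron entry is the obvious candidate, something like the following (the wrapper script name and path are hypothetical):

```
# crontab entry: run the restart script at 03:00 every three days
# (install with crontab -e; adjust path/name to where the script lives)
0 3 */3 * * /storage/kuali-scripts/restart-pentaho.sh
```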

Tuesday, March 23, 2010

Matching Numeric Ranges with Ruby

regular-expressions.info has a dedicated page on how this can be accomplished using regular expressions alone. That approach is limited in the ranges it can match, and the patterns add to the complexity of the expression needed to match the desired lines.

Realizing this shortcoming of a pure regex approach, I decided to follow a programmatic one with ruby. Here is a command that will display Java VM (run with -verbose:gc) garbage collection output with a pause time of between 2 and 5 seconds.

cat ./logs/catalina.out | ruby -e 'STDIN.each {|line| puts line if line =~ /\[GC [0-9]+K->[0-9]+K\([0-9]+K\), ([0-9]+\.[0-9]+) secs\]/ and (2..5).include?($1.to_f)}'
Ruby reads standard input a line at a time and checks each line against the regular expression. If a line matches, the part of the pattern enclosed by un-escaped parentheses is made available in the variable $1. That variable is converted to a float and compared to the given range. The line is then printed to standard output if the value falls within the range.

Friday, March 19, 2010

Password-less SSH Login

A google search will turn up several good step by step procedures to accomplish this. I realized after a few futile attempts that the file that is mentioned as authorized_keys in many places should actually be authorized_keys2 as described here.

I think the difference in file naming may have occurred due to changes in the OpenSSH tool set; a review of the change logs may be needed to confirm the version at which the change was introduced.
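For completeness, the key generation step of those procedures looks like the following; the server-side append is shown commented out since it needs a live host, and user@host is a placeholder:

```shell
# Generate a demo keypair with no passphrase (filename is arbitrary);
# -q keeps ssh-keygen silent. rm first so the sketch can be re-run.
rm -f demo_key demo_key.pub
ssh-keygen -q -t rsa -N '' -f ./demo_key
# On a real setup, the public key is then appended on the server
# (placeholder host, shown commented out):
#   cat demo_key.pub | ssh user@host 'cat >> ~/.ssh/authorized_keys2'
#   ssh user@host 'chmod 600 ~/.ssh/authorized_keys2'
```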

Monday, March 15, 2010

Resizing an LVM /usr partition in OpenSUSE 11.1

The yast2 tool makes resizing LVM partitions very straightforward. The only requirement is to unmount the partition being resized. To unmount /usr, one has to boot to single user mode since many processes have open files on that partition. Running lsof /usr | wc -l as root on one instance turned up 2588 open files!

It turns out that even after running umount /usr successfully in single user mode, yast2 disk fails since some files that yast2 needs are in the unmounted /usr partition. This left me with only the option of using the console tools. Below is the sequence of commands I ran to add 5G to my /usr partition.

modprobe dm_mod
lvextend -L +5G /dev/mapper/system-usr
shutdown -r 0
resize2fs /dev/mapper/system-usr


The first command loads the device mapper kernel module, which was necessary since the module was not loaded automatically in my case. I was only able to resize the file system after a reboot because I could not figure out a way to mount the partition in single user mode, though I believe it should be possible.

Monday, March 8, 2010

Generating a Tomcat PID file on the fly

I needed to use jsvc to stop tomcat processes, which required a PID file containing the process ID of the tomcat instance being stopped. My first alternative was to use jsvc to start Tomcat and specify the PID file to generate. Looking at the myriad of command line options that jsvc needed, I decided to stick with the provided startup.sh.

This left me with the option of generating the PID file after Tomcat had been started normally. For this, I used the lsof command, for which I found an excellent reference. This is the command I used:

lsof -t -l -a -i TCP:$1 -a -c java > $tmp_file && /usr/local/bin/jsvc -stop -pidfile $tmp_file org.apache.catalina.startup.Bootstrap

The line above consists of two commands: one where lsof writes Tomcat's PID into a file, and another where jsvc uses that file to shut down Tomcat. Here is a breakdown of the options to lsof.

-t output the PID only
-l inhibit the conversion of user ID numbers to login names (note: this is not a listening-socket filter; that would be -sTCP:LISTEN)
-a and (this enables a boolean relation to be constructed between options)
-i TCP:$1 show a process with an open TCP port held in the shell input variable
-a and
-c java show processes with the name java

Pitfalls of stopping a running script in Oracle SQL Developer


When Oracle SQL Developer is executing a long running script, it usually displays a progress bar with a 'cancel' button next to it.

Recently, I erroneously launched a script and hurried to press the said button. Sure enough, the progress bar disappeared and the cancel button was greyed out. Still, I went ahead and pressed the roll back transaction button several times.

It was a great shock to me that the 'cancelled' transaction was never stopped, and the data had been posted to the database. The roll back had obviously not worked either, possibly because SQL Developer considered the script 'cancelled'.

My two cents on this is that if a transaction that does not commit automatically is started erroneously, let it complete successfully, then roll it back.

Thursday, February 25, 2010

Enabling clustered tomcat mcast port in OpenSUSE firewall

The default firewall configuration will drop the multicast packets needed for tomcat instances in a cluster to 'ping' each other.

Open the firewall application as root, and select Broadcast on the navigation panel on the left.



Click the add button, select the active firewall zone and select 'User Defined' under service. Finally specify the protocol as UDP and provide the port number in use.



Click add, then save the settings, which take effect immediately.
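For those who prefer editing config files directly, the equivalent setting should live in /etc/sysconfig/SuSEfirewall2; I believe the variable below is the one the GUI writes, and 45564 is Tomcat's default cluster multicast port - adjust both to your setup:

```
# /etc/sysconfig/SuSEfirewall2 - open the cluster multicast UDP port
# in the external zone (45564 is Tomcat's default membership port)
FW_SERVICES_EXT_UDP="45564"
```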

Monday, February 15, 2010

Reading Standard Input in Ruby Scripts

I recently spent a few agonizing hours struggling to get a ruby script to work.
The script was iterating over standard input and subjecting each input line to a file/dir test. Valid files were not being recognized.

I later realized that the cause of the problem was the newline included in the input read from standard input. After stripping away the newline, the script worked.
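A shell parallel of the same pitfall and fix: read strips the trailing newline automatically, which is exactly what the ruby script had to do explicitly (e.g. with String#chomp) before the file test:

```shell
# Print only the input lines that name an existing path. Without the
# newline being stripped, the test would see "/etc\n" and fail.
printf '/etc\n/no/such/path\n' | while read -r line; do
  if [ -e "$line" ]; then echo "$line"; fi
done
# → /etc
```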

Copying a Pentaho Report Repository Manually

Copying an existing report repository is an easy way to create another repository for use with a test or development database. It can also be a way to create a production repository from an erstwhile development version.

These are the steps I used on Linux:
  1. Copy the repository

    cp -r repo-prod repo-dev

  2. Edit the index.properties file in repo-dev and change the name property to give it a different name than the original.

  3. Change the data source name to match the development one in each xaction file
    find . -name '*.xaction' | xargs sed -i 's/data-prod/data-dev/g'

  4. Change the repository name in xaction files that have sub-actions
    find . -name '*.xaction' | xargs sed -i 's/repo-prod/repo-dev/g'

  5. Change the image paths in the report design .xml files. cd into the directory first
    find . -name '*.xml' |xargs sed -i 's|/old/prefix/|/new/prefix/|'


Monday, January 18, 2010

Invalid column type in Pentaho

A java.sql.SQLException: Invalid column type error in pentaho broke a previously functioning report upon the introduction of changes.

On close inspection, I traced the problem to a newly introduced parameter in an SQL statement. The parameter was supplying data for a SQL IN(...) clause. When a variable of type string-list was supplied, the report worked; the exception above was thrown when a variable of type string was supplied.

To resolve the problem, I created a Java ArrayList to hold the single value and hence ensure that the variable supplied was always a list.