Friday, July 17, 2009

Creating a Remote Git Repo

Here are some steps I used to create a remote Git repo based on my local files.

Create the folder for the new application and change to that directory.
> mkdir /var/www/html/newapp
> cd /var/www/html/newapp

Make sure git is installed; if not, install it
> sudo apt-get install git-core git-doc

Initialize the folder as a git repository
> git init

********* Now switch to your local files ***********
> cd ~/myapp

Initialize git in the local folder as well
> git init

Tell git about yourself
> git config user.name "Your Name"
> git config user.email "Your Email"


Now, tell git which server to talk to
> git remote add origin username@yourdomain.com:/var/www/html/newapp/.git
Make sure that this user has SSH access to the box.


> git config branch.master.remote origin
> git config branch.master.merge refs/heads/master


Make your initial commit and push it to the remote server.
> git commit -a -m "Initial commit"
> git push origin master


And voilà! Your files are pushed to the remote server. At first I wondered why, when I went to the server, the files didn't appear in the working directory even though the log showed my initial commit. It turns out that when you push into a non-bare repository, the working directory isn't updated automatically. You have to do a git checkout -f to actually make the pushed commit live. Note that doing a git checkout -f will throw away any changes that have been made on the remote server.

To make the updating of the server easier, you can do the following:
> vi /var/www/html/newapp/.git/hooks/post-receive
Add the line: git checkout -f
Save the file
> chmod +x /var/www/html/newapp/.git/hooks/post-receive
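
Depending on your git version, the hook may also need to be pointed at the right place, since hooks run with GIT_DIR already set. A minimal sketch of the full hook, assuming the repo lives at /var/www/html/newapp:

#!/bin/sh
# post-receive: update the working copy after every push
cd /var/www/html/newapp || exit 1
unset GIT_DIR      # the hook runs with GIT_DIR set; unset it so git finds .git on its own
git checkout -f    # discards any local changes on the server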

Friday, June 26, 2009

Read-optimize your source code

A good article about writing code in a way that makes it easy for other developers to understand quickly.

http://www.brendel.com/consulting/blog/2009/06/read-optimized-source-code.html

Another useful unix command to find text in many files

find . -type f -exec grep -il "text to find" {} \;

This will give you a list of all the files that contain "text to find", starting with your current directory and searching all sub-directories.
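
An equivalent form, which tends to be faster on large trees because grep gets called with many files at once (just a sketch):

find . -type f -print0 | xargs -0 grep -il "text to find"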

Tuesday, June 9, 2009

Useful unix command to find and replace text in many files

Here is a useful unix command I used today to find and replace text across multiple files and multiple directories.

find . -type f -exec sed -i "s/string1/string2/g" {} \;

This will replace all instances of string1 with string2.
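
If you only want to touch certain files, you can narrow the find part; for example (a sketch, assuming you only care about Perl files):

find . -type f -name "*.pl" -exec sed -i "s/string1/string2/g" {} \;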

Credit to Google and http://www.sb.fsu.edu/~soma/Proj/How-to/Search+Replace.pdf for helping me figure this out.

Wednesday, May 6, 2009

Koha April 7th meeting

Galen plans to do a feature freeze at the end of August and release 3.2 at the end of October.

Before June we should be able to test BibLibre's acq work, and Liblime will be reporting on their development work.

Galen will be doing a Release Manager blog for Koha 3.2. If Nicole has time she will take a first stab at improving the Koha User list welcome message to include a good Koha FAQ and links to helpful tips and guides.

hdl mentioned that Bugzilla can accept emails...interesting, I will have to look into this and see if it can improve my workflow in tracking and dealing with bugs.

All new enhancements will be tracked using Bugzilla in order to keep everything in one place. This will help the community be better informed about what other Koha libraries are sponsoring and what developers are working on.

Tuesday, April 28, 2009

Random Idea: Have PACMon manage computer updates

Writing this down so I don't forget.

Have PACMon display a list of each upgrade that a PAC needs, and include a way of ignoring certain updates, with the option of viewing the ignore list. Then click upgrade and it will automatically send a message to the PACs to get the updates. Also include a display showing which PACs need which updates. These tasks would be performed using apt and could include custom repositories if special versions of software are required.

Wednesday, April 22, 2009

Post-KohaCon news

From Galen Charlton,
  • (Re)starting a weekly update newsletter for changes to Koha 3.2
  • [Galen] will be calling a monthly IRC meeting to discuss the status of Koha development. The next meeting will be Wednesday, 6 May 2009 at 19:00 UTC.
  • Making some improvements to how the bug database is used to record who is working on various bugs and enhancements.
  • Setting a convention to tag regressions so that they get a higher priority for bugfixes.
  • Making use of the bug voting feature in Bugzilla to help Koha users and contributors prioritize bugs.

Can you help Koha?

The Koha project is seeking:
  • Quality Assurance manager(s), to review all patches that are sent in, to uphold a particular level of quality in the Koha code base. It was suggested that perhaps this be a funded position through an organization or the KUDOS user group, since it would take a significant amount of time. It was also suggested that if a QA manager cannot be found, developers should do more peer review and sign off on each other's patches.
  • Bug wrangler(s), to review the bugs at http://bugs.koha.org and make sure that old bugs are closed and all bugs are valid. The number of open bugs listed at the Koha bugzilla site is currently 827.
  • Funding or development of core code refactoring, to clean up the existing Koha structure, make the entire development process more efficient, and increase code speed and readability. It is hard to find a specific library willing to sponsor an enhancement that only takes place behind the scenes, so it was suggested that this may fall to a group like the KUDOS user group.
These are just a few of the major roles that the Koha project is looking to fill. The project can always use more developers, documentation writers, translators, bug testers, and community advocates. Anyone looking for more information on getting involved in the Koha project can visit http://wiki.koha.org/doku.php

Sunday, April 19, 2009

FRBR Diagram

FRBR diagram from code4lib - http://www.frbr.org/files/frbr-er-diagram-5.png

Koha Thinking Outside the Box

Do we need MARC records? We do need the ability to import and edit in MARC record format, but why do we store it in the database in that format? In this day and age, where searching technology has advanced considerably, why are we still cataloging in a format that hinders search results and is much more time-consuming than taking a Google approach and doing a full-text style search?

Caching would greatly improve performance. Using memcached would help a lot because it keeps more information in memory. Nginx is a substitute for Apache, apparently works very well with memcached, and would also increase performance; Nginx talks directly to the kernel and would be incredibly fast at serving static content. The NY Times profiler is a great tool for finding the slow areas of a project; it would be great if someone could instrument Koha so it is easy to run the profiler against it. There was also an idea to rewrite major dependencies of Koha, like the MARC handling, and resubmit them to CPAN. YSlow with the Firebug extension is a good tool to help determine what is taking a long time to load on front-end web sites.

Switch the sessions and zebraqueue tables to MyISAM, because InnoDB never shrinks and only grows and grows.

Koha Sponsoring Discussion

This is a late post, but I wanted to write down some information from the sponsorship talk that occurred in the last session of the KohaCon users conference (April 17, 2009).

There was a heated discussion about the best way for the community to get better at informing each other about what projects are being sponsored and how best to share resources. It was agreed that the community would use the existing Bugzilla (http://bugs.koha.org) to record enhancement work and sponsored development. Since Bugzilla doesn't include fields for all the information we'd like to see about sponsored development features, it was suggested that a template/example bug be created so that others could see all the information that needs to be included in the comments field.

Many people are hoping that with this method, if someone doesn't have the money to sponsor a development outright, or would like to help partially fund one, it will be possible for others who are also interested in the idea to add feedback or even contribute.

I like this idea; however, ultimately I would like to see a better tool developed, as it could have a very positive impact on the Koha community. If there were a way for vendors to bid on features directly from this tool, and a way for libraries or individuals to contribute to a feature right from the browser, it would make the entire process much easier and more efficient. You could have a meter or similar feature that says something like, "Only $200 left needed to sponsor this project". It would also make it easy for people to vote and give certain feature requests a higher priority without having to search through the thousands of bugs in Bugzilla. This type of system could also have a way for people to say, "I have $1000 to offer for this project, who would be interested in developing it?"

Does this type of service exist anywhere? Is there a similar open source tool? Could this be a website that supports many different projects organizing their development efforts?

Koha Dev Workflow Tips and Tricks

pieces "borrowered" from Nicole Engard blog post (thank you!)

Git
Chris - Git has a built-in garbage collector, $ git gc. If you run it after creating a branch and before checking it out, it makes switching branches much faster (make a branch, run this, and then checkout the branch).

Joe - has a script he makes on different servers that he uses to get his shell to where he wants it to be for testing, it sets his self created variables and standard values (http://blogs.liblime.com/developers/2009/04/19/simple-shell-trick/). Also he posted a good tip on the LibLime blog: http://blogs.liblime.com/developers/2009/04/10/simple-git-trick-for-bash/

Galen - Since he can't claim to have the visual design skills that Owen does, he has become a real stickler about the HTML in the OPAC and staff client, making sure it is valid XHTML. There are tools in Firefox that make for a good development environment: Firebug; the HTML Validator plugin, which is excellent for validating quickly without submitting your site to an online validator; the Firefox Accessibility plugin, which lets you run automated tests against your site to meet ADA requirements; and the Web Developer plugin. The Yahoo! dev tools are slow, but they provide valuable info. Something that is useful, but not a dev tool, is the Zotero citation manager plugin (www.zotero.org), geared toward people who do lots of research online.

Unix
-screen command, very useful! Look up example .screenrc files
-cluster ssh, the ability to have multiple ssh sessions open and run commands in all of them at the same time http://sourceforge.net/projects/clusterssh/
-sshfs, mount a remote file system
-nohup, runs a script in the background, which is good for things like a Zebra rebuild in case your ssh session gets disconnected (see the examples below)

grep for files and pipe the results to xargs
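
A couple of sketches of the above (the rebuild_zebra.pl path/flags and the search strings are just placeholders for your own setup):

nohup perl misc/migration_tools/rebuild_zebra.pl -b -r > rebuild.log 2>&1 &
grep -rl "old text" . | xargs sed -i "s/old text/new text/g"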

Saturday, April 18, 2009

Git tips

tig - cool way to look at commits

git send-email 0001-patch1 0002-patch2 etc etc

To get patches from email you can use mutt, fetchmail, etc. You can even use Thunderbird to get patches, by saving the message as a .eml file.

git am -u -s -3 -i /path/to/patch.eml

-u = use utf-8
-s = add a Signed-off-by line to the commit
-3 = use a three-way merge; Galen says that this just makes it work
-i = use interactive method

Notes on Koha structure

Make sure that get_template_and_user is the first call and that no actions are taken before it. Any call made before that line is a security risk, because a user who knows the URL could run it directly without authentication.

#!/usr/bin/perl
use strict;
use warnings;
use CGI;
use C4::Auth;        # provides get_template_and_user
use C4::Output;      # provides output_html_with_http_headers
# plus whatever other C4:: modules the script needs

my $query = new CGI;     # often named $input instead; lets you grab params

my ( $template, $loggedinuser, $cookie ) = get_template_and_user( ... );   # arguments omitted here

my $biblionumber = $query->param('biblionumber');

Subroutines should follow the naming convention GetSomethingAndSomething, using one of the standard verbs:

Get
New
Mod
Del

$op = 'mod' (or 'get', 'new', 'del') # "op" is the conventional parameter name for the requested operation
output_html_with_http_headers is the last call; nothing is executed after it.

Friday, April 17, 2009

Status of Development for Koha 3.2

presented by Galen Charlton

Goals for Koha 3.2 (an ambitious release)
  • New acquisitions module
  • Holdings support
  • Many circulation improvements - allowing you to configure the circulation rules to the nth degree
  • Improving stability
RFCs - A statement saying we want this bit of functionality, or we (programmer, library, vendor) are going to implement this function for the next version of Koha. Sometimes the RFCs are just "wouldn't it be nice" ideas.

Not all RFCs will be implemented for 3.2. Some were just proposals, and others didn't end up being sponsored.

New acquisitions module
  • Developed by BibLibre
  • In production with one of their customers
  • Working on submitting their patches for 3.2
  • Review and testing period
Holdings structure
  • Developed by LibLime, sponsored by WALDO
  • Introduces "summary" records to Koha
  • Entering testing by WALDO
  • You will be able to load them and export them, and in theory you can pretend that the 852 tag doesn't exist.
WALDO circulation features
  • Proxy patrons
  • Fines thresholds
  • Callslips, similar to request system
  • Recalls, if someone has an item on loan and a faculty member needs it, the recall will say return it at once
  • Hourly loans, meant to support having an item checked out for hours instead of days
  • Email checkout slips
Other circulation features
  • Calculate fines in days debarred, developed by BibLibre for 3.0 and will make its way to 3.2 soon
  • Placing hold on multiple items, introduced by Alloy Computing (yay HCL and Alloy Computing)
  • Additional hold request improvements sponsored by NEKLS
  • Course reserves
OPAC enhancements
  • Support for enhanced content providers, Syndetics, LibraryThing, Babeltheque
  • Tag multiple items (yay HCL and Alloy Computing)
Cataloging
  • biblios.net marc record editor integration
  • Improved browse indexes
  • ISBN13 normalization (sponsored by PISD) - meaning if you only have a 10-digit ISBN it will compute the ISBN-13, and vice versa
  • Item bulk status change, BibLibre working on this and Liblime also working on a global change
  • Brief records
  • Record maintenance
  • Deleted records - ability to still be able to search for deleted items and bibs in a specific context
Serials (for a non-library person like myself, what are Serials?)
  • General improvements to serials display and prediction pattern management
  • More control over display of recently checked in issues (WALDO)
Administration
  • Improved system preference editor (Jesse Weaver, developer at small library)
Reporting
  • Improvements to the guided reports, mainly a placeholder mechanism. It lets you supply a parameter at the time of running the report instead of having to edit the SQL every time.
Miscellaneous
  • Granular permissions (new acquisitions module already implements some of these)
  • Internet Explorer compatibility improvements (WALDO)
  • Improvements to overdue report (PISD)
  • Improved OAI-PMH server (Tamil)
  • URL checker (Tamil)
Galen, "There will be at least one cool new thing coming from the KohaCon development meeting this weekend"

Timeline
Nothing is finalized, but the current goal is to release 3.2 in late summer or early autumn, with a 3.1 release for testing in early summer.

The tip of the Koha bleeding edge always has an odd release number, following the Linux kernel version numbering system.

Stimulus package and Koha

It was mentioned that it might be possible to apply for Tech Stimulus package money to use for Koha development and expenses. It seems like putting the money toward developing a free, open source solution would be a great use of it, since it ultimately saves money in the future.

Is this something we can investigate further?

Customize your OPAC

presented by Owen Leonard

Owen works at Nelsonville Public Library and says he wouldn't customize the templates again: although you can customize them very easily, it is a huge pain to have to maintain them.

Changing the stylesheet is the best way to customize the OPAC because it doesn't need to be maintained each time you upgrade.

Owen recommends a site called Listomatic; it gives you CSS and HTML code for many different types of menus.

opaccolorcss = file name of a custom CSS file
opacusercss = CSS embedded inline instead of in a separate file

Tools that Owen recommends for Web Development:
  • Firebug - Firefox extension (must have)
  • jQuery - JavaScript library
Link to Owen's blog - http://www.myacpl.org/koha/

Thursday, April 16, 2009

Koha Integration: RFID, SIP2, LDAP

presented by Joe Atzberger

LDAP
List of LDAP Tools
1. Apache Directory Server & Studio (client) http://directory.apache.org
- Newer than openldap and more stable
- Runs on OSX, Win32, and linux
- Open source
2. OpenLDAP - http://www.openldap.org
- Includes command line tools

Koha LDAP does not go and grab all your users as a "dump"; that is what IMPORT is for. Instead it updates each user when they try to log in.

Main server configuration goes in the koha-conf.xml file. The entry looks similar to this (element names from memory, so double-check against the C4::Auth_with_ldap documentation):

<ldapserver>
  <hostname>ldap://auth.example.com:389</hostname>
  <base>dc=example,dc=com</base>
  <user>cn=Admin,dc=example,dc=com</user>
  <pass>example</pass>
</ldapserver>

Bind-as-auth has been hacked into Koha, but not done cleanly enough to merge into mainline Koha.

There are two options, replicate and update, which can sync information between LDAP and Koha (very cool!!!)

Know your own Schema, example used was inetOrgPerson.

You can define the data Koha cares about by using mapping elements in koha-conf.xml. Each element maps a Koha field to an LDAP attribute, and the element's content supplies a default value; something like this (syntax from memory):

<mapping>
  <branchcode is="branch">CPL</branchcode>
</mapping>

(this makes the default branch CPL)

Three kinds of required data: fields required by the database, fields required for login, and fields required by us.

Database
- surname
- address
- city

Login
- userid

SIP2
Unlike LDAP, SIP2 runs as a totally separate server process, normally on a completely different server.

Extra dependencies - UNIVERSAL::require and Net::Server::PreFork

There is now a well-documented setup process in an appendix to the Koha manual. The SIPconfig.xml file contains info about the SIP server; in it you can specify the port, and Joe recommends the telnet transport so as not to expose the raw protocol to the outside world. Make sure the user that SIP logs in as actually has the correct permissions to perform circulation functions.

perl -I./ ./SIPServer.pm /home/koha/sipconfig/Sipconfig.xml (an example command for running the SIP server)

It doesn't support things that are specified at the item level, such as item-level holds, or other features that were added to Koha after the SIP implementation. Almost all 3M hardware requires an extension to the SIP2 implementation, which is also not yet supported, but Joe believes that NEKLS will be sponsoring that change.

There was some question about whether you can use SIP over SSL. Joe mentioned that when the SIP specifications were written, the assumption was that you would use a serial connection or at least be on the same network as the SIP server. Apparently, it is possible to set up a secure SSH tunnel and run SIP over that connection.
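
A sketch of what that tunnel could look like, assuming the SIP server listens on port 6001 (host and port are just placeholders):

ssh -N -L 6001:localhost:6001 koha@sip.example.org

and then point the SIP client at localhost:6001.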

Koha Disaster Recovery

presented by Clay Fouts

Clay's primary job is to maintain the centralized repository for the Liblime hosted Koha installations.

Disaster is inevitable... people will make mistakes, hardware will fail, and malicious infiltrators can gain access. What is important is understanding how to minimize downtime, reduce the frequency of failures, and avoid losing important data.

How is data stored in Koha?
  • Koha source files and related configuration files
  • Perl dependencies
  • Zebra index content
  • MySQL database contents
There are a number of trade-offs you have to make when dealing with disaster recovery, such as speed, expense, reliability, and flexibility. For example, it is possible to get 99.99% uptime; however, setting up a system to support that kind of uptime will cost lots and lots of money, because you will need multiple redundant servers, with RAID hard drives and built-in automatic failover.

Storage Media
Disk (hard drive) - Fastest way to move data, but not very portable and still expensive compared to other options.
DVD/CD - Extremely cheap and portable, but doesn't hold much data.
Tape - Portable and holds more data than CD/DVD, but not as much as most hard drives. Very expensive...
Cloud - Unlimited capacity and redundant, so you will not lose data. Nice because you will not have to maintain another piece of hardware. The limiting factor is network bandwidth.

Simply making backups is NOT sufficient; you need to make sure that the backups actually work and that you are able to restore completely from them. Verify that the media doesn't degrade and that it is secure and accessible. If you store the data off-site, will it be accessible in the middle of the night if you need to do a recovery?

Where to Store the data?
Onsite - Fast, cheap, easy access, but if the place burns down, you lose everything.
Offsite - You still have backups if your place burns down, but access is difficult, expensive, and slow if you ever need them.
Cloud - Can be the best of both if your network supports it. Most providers have redundant disks stored at multiple facilities spread across the world.

The optimal strategy is some kind of combination, for example keeping backups onsite for quick restores, but also storing the data offsite in case of a disaster.

MySQL Data Backups (most important)
The most sensitive data, and impossible to rebuild manually, because you don't know who has checked out what, who owes what fines, etc. It should be backed up frequently using multiple methods. Clay recommends using MySQL's binlogs as part of the backup strategy.
Logical - using tools like mysqldump. Provides the ability to rebuild a database from raw SQL statements. Very portable and can even be moved to another platform. Slow to back up and restore, and the database is inaccessible during the backup operation.
Binary - using tools like cp, dd, and LVM snapshots. Faster than logical backups. Still blocks access during the backup, but for less time; LVM snapshots can be taken almost instantaneously. You need to script it so that access to the database is blocked while the snapshot is taken, because it does not do that inherently; otherwise you will get corrupted data. Binary files and disk partitions are much larger than logical SQL backups, and the backups are difficult to verify and less portable.
Mirroring - using master/slave replication. Fastest because it's ongoing. The replication server can be used as a hot spare in case the primary server goes down. It has other uses too; for example, reports can all be generated against the slave server so they don't degrade the performance of the master. Very useful in combination with logical and binary backups, because you can take backups from the slave so the master server never locks up. Replication can introduce inconsistencies (timestamps, etc.); to combat this, make sure the master server is not overloaded and avoid network latency.
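
For the logical approach, a cron-able one-liner is enough to get started. A sketch, where the database name, user, and backup path are placeholders for your own setup (in a cron job you'd pull the password from a .my.cnf instead of using -p):

mysqldump --single-transaction -u kohauser -p koha | gzip > /backups/koha-$(date +%Y%m%d).sql.gz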

Code base backup
Useful if you have customized the code.

Zebra Index backup
A huge database, but not unique data: there is no data loss if it is not backed up, but rebuilding will increase recovery time, and it could take hours to rebuild the indexes if you have lots of data. Make sure that you are not adding to the index while the backups are going on.
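
Rebuilding from scratch looks something like this (script path and flags from memory, so check your install; -b rebuilds the biblio index and -r resets it first):

perl misc/migration_tools/rebuild_zebra.pl -b -r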

Public Library Lunch Group

We walked over to the mall across the street from the conference center to have lunch (very nice mall by the way). After finally getting my food, I was fascinated to hear some of the stories about the other public libraries using Koha and migrating to Koha.

I talked with two ladies from East Brunswick Public Library and was fascinated to hear very similar concerns to our own. They are currently using Horizon 7.4 and plan to go live on Koha in May (?). They mentioned that they plan to use Horizon to do cataloging and acquisitions since they found it difficult to currently use Koha to perform these tasks.

Several people at the table also expressed concerns about not knowing what all the different vendors are doing: how do we know we are not all paying for the same feature? We would like the vendors to communicate more with the community about what features are being developed.

SQL Structure

presented by Paul Poulain

The authorised_values table is used to define the lists available in a lot of places. Paul calls it "the swiss knife tool in Koha".

Don't try to fill the biblio and items tables directly; use the bulkmarcimport.pl tool or the API (more complex).
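
For reference, a typical bulkmarcimport.pl run might look something like this (the path and flags vary by version, so check the script's documentation first; the file name is a placeholder):

perl misc/migration_tools/bulkmarcimport.pl -file /path/to/records.mrc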

Koha is set up behind the scenes to handle both UNIMARC and MARC21. There is lots of data in a MARC21 record, but only a few fields are important and very frequently used, like title, author, and call number.

History
  • Koha 1.x was not dealing with MARC.
  • Koha 1.x implemented something like FRBR before FRBR existed.
  • Koha 2.0 implemented MARC but relied on the previous 3 tables
Biblio and Biblioitems tables
Raw MARC data is stored in the biblioitems table, in the marcxml field. Some information is also stored in clearly named fields in the biblio table. The biblioitems table contains the information that is the root information of the edition.

In Koha 3.0 there should be a 1-to-1 correspondence between the biblio and biblioitems tables. The two tables could be merged into one, but that has not been implemented yet. Koha takes care of translating the MARC data into the clearly named fields; that is why you must use the import tool or the API.

The biblioitems table should be called biblioeditions, but no developer has renamed it yet.

Which table is used..
  • On results lists, MARC detail from marcxml
  • On standard biblio detail screen, if you have XSLT the data is taken from marcxml (?)
  • For UNIMARC detail screens, the biblio and biblioitems tables are used.
  • If just a few key fields are needed, they are grabbed from the biblio and biblioitems tables
If the Zebra database crashes, everything is still stored in the MySQL tables.

The items table contains a row for each physical item in the library. All of the information is contained in the biblioitems.marcxml field; however, just as the biblio table contains decoded information, the items table also contains some specific decoded fields.

In Koha the primary key for biblio data is the biblio#.

Borrower table
Surname is the only mandatory field in the borrower table. Personal information such as first name, title, other names, initials, date of birth, address, street number, street type, address2, zip code, city, phone, and email are all examples of fields in the borrower table.

The primary key for the borrower table is the borrower#. The password in Koha is never stored in plain text; it is only stored as an MD5 hash.

Some free-to-use fields are sort1 and sort2, linked to authorised_values. The borrower_attributes table supports extended attributes.

Issues table (checkouts/checkins)
All current issues are stored in issues table. All previous/finished issues are stored in old_issues table. They both have the same structure.

Four important fields: borrowernumber, itemnumber, date_due, and issuedate.

History of Koha

presented by Chris Cormack and Paul Poulain

Chris is one of the original developers of Koha in 1999. There were no suitable responses to the RFP, as they needed an application that worked over slow connections. A web-based application was a natural choice because it works well over slow connections. There were no open source solutions at the time, and open source was a natural choice because neither HLT nor Katipo wanted to be a vendor. Development was done as rapid prototyping with lots of refining.

It's just a database... but with lots and lots of rules.

History of Koha: http://stats.workbuffer.org/history.html

When they first released version 1.0 of Koha, the first copy was downloaded within 20 minutes. In 2002, Paul Poulain got involved, as did Owen Leonard from Nelsonville Public Library.

Chris soon realized that when working with libraries, standards meant "standards" and he often had to ask which MARC.

In 2003, the community continues to grow, more developers, more translations. NPL went live in 2003. The NZ prime minister mentioned Koha in a speech.

In 2004, the first Koha docs were written at www.kohadocs.org, and for the first time a non-technical person contributed to Koha.

In 2005, Liblime got on board with Koha to support U.S. libraries. Paul formed a partnership with Henri Damien Laurent to meet the demand for Koha support. Decided to go with Zebra for full-text searching. New www.koha.org site released.

In 2006, Chris presented about Koha at Linux Conference Australia and Linus was in the audience! The first KohaCon was held in Paris.

In 2007, there were lots of talks with Liblime about hiring Chris and several others. Koha Days were started to fix features that no one was sponsoring but that needed to be done. Koha won a NZ Open Source Award, despite being in the same category as the guy who wrote Ruby on Rails.

In 2008, Nicole did huge amounts of work on the Koha 3.0 manual. Paul and Henri formed Biblibre. Chris left Liblime and now works for Catalyst, but is the translation manager. Chris says, "2008 should be called the year of India", as major libraries including Dehli Public Library switched to Koha and several major companies to support Koha.