Tuesday, January 17, 2006

Building a multi-channel retail infrastructure with open source software.

This paper describes how we created a multi-channel retail infrastructure with open source software and commodity hardware. It is an ongoing project, now in its fourth year, and we are experiencing some level of success in each of the areas in which we operate. I hope that this will be of interest to the community at large, and that it may generate some discussion which will allow us to improve our setup and investigate new areas which we have not considered.

A multi-channel retail infrastructure is one in which a number of marketing channels (the diagram shows three of these: web, telesales and stores) are co-ordinated so that they can be viewed as a single entity.

This has several advantages. Effective cross-channel customer relationships can be established. Savings can be achieved in both the supply chain and logistics functions, and a more holistic view of the enterprise can be created, allowing the staff to view the business in different ways which may present new opportunities. The management of the systems and business is also simplified and process updates are more easily applied.

Who are we and what are we trying to do?

We set up Ethical Shopper with a number of objectives in mind. The principal objective is to allow consumers to shop according to their own 'ethical' criteria. This would mean, say, that a vegetarian would not be presented with products containing animal remains; an individual opposed to the tobacco trade would not be presented with products made by companies involved in that business (an example being Kraft Dairylea - a cheese product that is owned by Philip Morris Inc, the world's largest tobacco company); and those opposed to the arms trade would only see stock that originated from companies with no involvement in the weapons business, and so on.
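As a rough illustration of this kind of filtering, here is a minimal sketch in Python. It is not how our system is actually built: the `Product` class, the tag names and the `visible_products` function are all hypothetical, and it simply assumes that each product carries a set of tags describing its contents and its parent companies.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Product:
    name: str
    # Hypothetical tags describing the product and the companies behind it.
    tags: frozenset = field(default_factory=frozenset)


def visible_products(products, excluded_tags):
    """Return only the products that carry none of the shopper's excluded tags."""
    excluded = frozenset(excluded_tags)
    return [p for p in products if not (p.tags & excluded)]


# Illustrative catalogue; the entries and tags are invented for this sketch.
catalogue = [
    Product("Fair Trade coffee"),
    Product("Beef stock cubes", frozenset({"contains-meat"})),
    Product("Processed cheese", frozenset({"tobacco-parent"})),
]

vegetarian_view = visible_products(catalogue, {"contains-meat"})
anti_tobacco_view = visible_products(catalogue, {"tobacco-parent"})
print([p.name for p in vegetarian_view])
print([p.name for p in anti_tobacco_view])
```

Each shopper simply supplies their own set of excluded tags, so new criteria can be added without changing the filter itself.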

The original founders of Ethical Shopper were all active in the technology business before and during the 'dot-com' boom of the 1990s and learned a considerable amount about a wide range of issues related to retail, banking, telecommunications and the application of technology in these areas. We were all affected by the events of the 1990s and we are keen to establish a new business model for the 21st century that will reflect more accurately the requirements of stakeholders in the business, as we believe that this is a more efficient as well as 'fairer' way of running and developing the business. We are also committed to the ideas of Business Process Reengineering as described in Michael Hammer's excellent work 'Reengineering the Corporation', which sets out some of the principles that we would like to apply in this new business model.

In order to reach our final objective we have broken the evolution of the business into segments. Because of a specific interest in the Fair Trade movement (for further information please refer to the Fairtrade Foundation, the association of Fair Trade shops or the International Fairtrade Association in the UK, or to Transfair in the United States or Continental Europe) we decided to set up a retail infrastructure supporting this first. We are based in the United Kingdom, so we initially set up links with the association of Fair Trade shops, which currently has over 100 retail outlets across the country. The links with Europe will also allow us to expand the idea once we have established it here. Furthermore, once our systems have been proven and we have established working relationships, we would like to involve some of the 850 Oxfam shops based in nearly every town in the country and the 2,500 Co-operatives.

So to our current position. We have a good base system which is currently taking orders from the private, commercial and Government sectors, although in fairly small quantities at the moment. Our system, based on Debian, PostgreSQL and J2EE, has operated with fewer than five days' cumulative downtime over the last three years and we have taken orders from Eire, the US and Canada as well as this country. The system supports orders through the web site, orders through a local rate telephone number, postal orders (using XSL-FO we can produce a printed catalogue directly from the system itself) and orders from a store in Upper Street in Islington (soon to get a bar code reader). The entire operation runs on a very low monthly expenditure (excluding stock purchases, audit fees, property rental and carriage costs) and currently has over 380 registered users (the result of no marketing at all apart from word of mouth and our storefront in Angel).

Why Open Source?

A number of charities, SMEs and NGOs were severely affected in the .com bubble when they allowed salesmen to tie them into proprietary systems. This meant that a very specific skillset was required in order to maintain these systems (often only available at premium rates) and the full power of these systems was only available with upgrades to the base system, usually requiring extra payment. At an early stage we realised that our target market was not in a position to pay large up-front costs and would be best able to provide compensation for work done through profits specifically attributed to technology and process improvements. This therefore required a very low start-up cost and an ongoing low burn rate (the rate at which a business or department consumes cash in order to maintain itself in a trading position).

We had been involved with open source software for some time (since the mid 1990s and the emergence of the Linux operating system) and realised that it could be used to provide savings in a number of essential areas. Because there is no cost for each additional server running Linux, we could make use of machines that are now well below the minimum ideal specification but that were available at low cost (or in some cases free) from a range of sources (auctions, eBay, second-hand shops, charity outlets and machines which friends and family no longer find suitable for their needs). This meant that what we lacked in reliable hardware we could make up for in redundant systems.

However, this is not without its associated costs. Older hardware is, by its nature, less reliable, and this requires more complicated configuration as failures need to be detected and worked around fairly quickly. The older machines also draw more power (Amps) while delivering less performance, and this has caused us to start dropping some of our really old machines.
The lack of any per-CPU (or other) license fees means that we can afford to add services to a machine which would not be economically viable should a license be required for them.
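To give a flavour of what detecting and working around a failure can look like, here is a minimal sketch in Python. It is an assumption-laden illustration rather than a description of our actual configuration: the host names, the port and the functions `tcp_probe` and `first_healthy` are all hypothetical, and the probe is nothing more than an attempt to open a TCP connection.

```python
import socket


def tcp_probe(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def first_healthy(servers, probe=tcp_probe):
    """Return the first (host, port) pair that answers the probe, or None."""
    for host, port in servers:
        if probe(host, port):
            return (host, port)
    return None


# Usage with a fake probe, so the sketch runs without any network access.
# The host names below are invented for illustration.
pool = [("old-box-1.example", 8080), ("old-box-2.example", 8080)]
down = {"old-box-1.example"}
chosen = first_healthy(pool, probe=lambda host, port: host not in down)
print(chosen)
```

Passing the probe in as a parameter keeps the selection logic separate from the network check, which makes it easy to exercise without touching real machines.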

A further advantage is that our use of software based on open standards means that our partners are not limited to using only our services in the future should circumstances change. The skills that we provide could equally be provided by generic DBAs (database administrators), designers and programmers, albeit with a fairly steep initial learning curve. These skills would also be available from others working on the open source projects that we adopted, thereby reducing vendor lock-in for our partners and reducing their risks accordingly.

Finally, we wanted to work with systems that we could show off. We have all worked on closed systems that we could not demonstrate in the public domain, and this work is now behind the firewalls of the corporations that we served, unattributable and anonymous. We consider software engineering to be a form of art and it means a great deal to us to be able to show our peers what we have created.

Choice of operating systems and why we chose Debian GNU/Linux

Originally we ran Red Hat Linux on our servers but found that we were using only small sections of it in each build. Our tech lead had been looking at Debian, BSD and Slackware for some time and each of these had its advantages. As our setup became more complicated the Red Hat standard build applied less, and we started to use the minimal install and add components as required. The Debian philosophy had appealed for some time (using only GNU components in the build) and, whilst the more technical members of our crew were comfortable with this, I was less capable and had numerous installation issues (X-related) while trying to run it as my workstation, which made me somewhat thoughtful about using it as our principal server base.

However, after installing it without problem on a number of servers I found it actually very straightforward, and the reduced number of packages was helpful insofar as it was simpler to maintain. There are various issues with Debian, as there are no proprietary extensions within the build and these had to be added. The apt-get command is only available for some of the packages we use, and although we have become proficient in installing packages from source this also throws up a number of issues (fewer if you establish standards at an early stage for directory structures, naming conventions and startup scripts).

In the end, since my experience (as the principal user) has been mostly Linux-based, it didn't make sense to start down the BSD route with the unknown problems that this might present, in spite of the advantages in security that it offers. Slackware, although Linux-based, presented a similar issue, and Debian was chosen as our new base.

Security - Linux firewalls

The principal attraction of BSD was the very high level of security that it offered. Although we do not currently retain credit card or financial information, we do hold sensitive, customer-specific information which must be held securely if we are to maintain our credibility. As time goes on and we acquire more data this will become still more critical. As most of our technicians come from a banking background this has always been uppermost in mind, and so the first machine to be set up (and, until its recent replacement, the longest serving) was a firewall based on Linux. Our setup was very similar indeed to the one detailed in a recent issue of Linux User Magazine, and the principal activity was to close off all extraneous ports and set up access controls and logging. Although we have a first-level hardware firewall (our concession to the proprietary world) I don't propose to discuss it here, and our development effort is dedicated to our Linux firewalls.

Routing, proxying, caching and management support.

Currently our connectivity is provided by an ADSL service with an uplink of 400K, which is sufficient for our current requirements. We have also recently entered into negotiations with an ISP which will give us considerably more bandwidth if required, as well as redundancy, which is something we would value (nearly all our downtime at the moment is due to errors on the line, although this has recently improved dramatically with a new service). We use a number of ports on our boxes as we make full use of SSL for client security. The routing internally is accomplished using iptables. We also use Squid as the proxy, which helps with our internal network and caching. The internal network is increasingly rarely used as most of the contributors to this project work remotely. The technical users are fine with an ssh client (PuTTY principally) and the non-technical users are provided with management screens to the appropriate applications (more on that later on). The application server contains its own logfile analysis tools (currently under development, but some stats are available) and we use Analog and Report Magic. These tools are covered in some depth in other articles and so I won't dwell on them here but will move swiftly on to the next layer of our business infrastructure.

Choice of database. Our options and why we chose PostgreSQL.

Our choice of middleware gives us a wide range of databases to choose from, and connectors are provided to SAPdb, MySQL, Hypersonic, MaxDB and Oracle. We chose Postgres over the other alternatives for a number of reasons. Postgres is a fully ACID-compliant database which has proper support for transactions (particularly important for e-commerce). It has been going for some time (since 1986) and has an active community around it. It is very efficient and doesn't suffer from speed or memory problems, and the JDBC driver has been stable for a while and also has an active community around it. In addition to this it uses a standard type of SQL, which should improve portability in the future should that be required. A close contender for us was SAPdb, which used to be the database for the SAP ERP product until the consultants chiefly recommended Oracle and so demand was reduced. This has received a lot of attention on the list and, particularly with the recent merger with MySQL, is one to watch.
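To show why transaction support matters so much for e-commerce, here is a minimal sketch in Python. Please note the assumptions: it uses the sqlite3 module from the Python standard library as a stand-in so that it runs anywhere, not Postgres and not our J2EE stack, and the table names, the SKU and the `place_order` function are hypothetical. The point is only that recording an order and decrementing the stock either both happen or neither does.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stock (sku TEXT PRIMARY KEY, "
    "qty INTEGER NOT NULL CHECK (qty >= 0))"
)
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, "
    "sku TEXT NOT NULL, qty INTEGER NOT NULL)"
)
conn.execute("INSERT INTO stock VALUES ('COFFEE-250G', 3)")
conn.commit()


def place_order(conn, sku, qty):
    """Record an order and decrement stock atomically; return True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("INSERT INTO orders (sku, qty) VALUES (?, ?)", (sku, qty))
            conn.execute("UPDATE stock SET qty = qty - ? WHERE sku = ?", (qty, sku))
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint refused negative stock; the order row is rolled back too.
        return False


ok = place_order(conn, "COFFEE-250G", 2)        # enough stock
rejected = place_order(conn, "COFFEE-250G", 5)  # would drive stock negative
order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
remaining = conn.execute(
    "SELECT qty FROM stock WHERE sku = 'COFFEE-250G'"
).fetchone()[0]
print(ok, rejected, order_count, remaining)
```

Without the transaction, the second call would leave an order on the books for stock that does not exist.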

The setup for this system was very simple but there were a few issues which I will comment on. Firstly, I think that it is definitely worthwhile to upgrade to the latest version of Postgres. This is considerably faster than previous versions (with this application anyway) and it has improved support for JDBC. Debian does not come with the JDBC drivers as these are not GNU approved, and so the database will have to be recompiled with them (the flag is --with-java). If you are building from scratch then this won't be a problem. In addition you will want to ensure that you have a recent version of the JDK (1.4 or thereabouts) and Ant, and these will have to be added to the path in order to be picked up by the compiler.

In due course we hope to make more use of the JavaMail component of J2EE, but at the moment we are using qmail as our mail server and SquirrelMail as our mail client. As our workforce is geographically diverse and uses a range of different machines, we have standardised on web standards and try to make all of our services available using a web browser. These two packages help us accomplish this for email and they are very heavily used indeed.

In my next posting I will speak more about our application layer, for which we use OFBiz, which provides us with e-commerce, order, workflow and product management. I will also discuss the client-side tools that we use (OpenOffice, GIMP and Dia) and our plans for the future, including clustering arrangements and high availability.
