Tag Archives: open APIs

Tech Tips From The Nonprofit Technology Conference

This article was first published on the Idealware Blog in May of 2010.

Last month, I reported on the first annual Tech Track, a series of sessions presented at the April 2010 Nonprofit Technology Conference. In that post I listed the topics covered in the five-session track. Today I want to discuss some of the answers that the group came up with.

Session 1: Working Without a Wire

This session covered wireless technologies, from cell phones to laptops. Some conclusions:

The state of wireless is still not 100%, but it’s better than it was last year and it’s still improving. Major metropolitan areas are well covered; remote areas (like Wyoming) are not. There are alternatives, such as satellite, but that still requires that your location be in unobstructed satellite range. All in all, we can’t assume that wireless access is a given, and the challenge is more about managing staff expectations than about installing all of the wireless ourselves. It will get there.
Wireless security options are improving. Virtual Private Networks (VPNs) and remote access solutions (such as Citrix, VNC and Terminal Services) are being provided for more devices and platforms, and the major smartphone companies are supporting enterprise features like remote device wipes.
Policy-wise, more orgs are moving to a model where staff buy their own smartphones and the organization reimburses a portion of the bill to cover business use. Some organizations set strict password policies for accessing office content; others don’t.

Session 2: Proper Plumbing

This session was pitched as covering virtualization and other server room technologies, but when we quizzed the participants, virtualization was at the top of their list, so that’s what we focused on.

We established that virtualizing servers is a recommended practice. If you have a consultant recommending it and you don’t trust their recommendation, find another consultant, but still have your systems virtualized; the recommendation is a good one, and the real problem is that you don’t trust your consultant!
The benefits of virtualization are numerous: reduced budgets, reduced carbon footprints, instant testing environments, and 24/7 availability (if you can upgrade a copy of a server and then swap it back in live, an advanced virtualization feature).
There’s no need to rush it — it’s easier on the budget and the staff, as well as the environment, to replace standalone servers with virtualized ones as the hardware fails.
On the planning side, bigger networks do better by moving all of their data to a Storage Area Network (SAN) before virtualizing. This allows for even more flexibility and reduced costs, as servers are strictly operating systems with software and data is stored on fast, redundant disk arrays that can be accessed by any server, virtual or otherwise.

Session 3: Earth to Cloud

The cloud computing session focused a lot on comparisons. While the general concern is that hosting data with a third party is risky, is it any more risky than hosting it on our own systems? Which approach is more expensive? Which affords the most freedom to work with our data and integrate systems? How do we manage disaster recovery and business continuity in each scenario?

Security – Everyone is hackable, and Google and Salesforce have a lot more expertise in securing data systems than we do. So, from an “is your data safe?” perspective, it’s at least a wash. But if you have sensitive client data that needs to be protected from subpoenas, as much as or more than from hackers, then you might be safer hosting your own systems.
Cost – We had no final answers; it will vary from vendor to vendor. But the cost calculation needs to figure in more than dollars spent — staff time managing systems is another big expense of technology.
Integration and Data Management – Systems don’t have to be in the same room to be integrated; they have to have robust APIs. And internal systems can be just as locked down as external ones if your contract with the vendor doesn’t give you full access and control over your data. This, again, was a wash.
Risk Management – There’s a definite risk involved if your outsourced host goes out of business. But there are advantages to being hosted, as many providers offer multiply-redundant systems. Google, in particular, writes every save on a Google Doc or GMail to two separate server farms on two different continents.
It all boils down to assessing the maturity of the vendors and negotiating contracts carefully, to cover all of the risks. Don’t sign up with the guy who hosts his servers from his basement, and have a detailed continuity plan in place should the vendor close up shop.
If you’re a small org (15 staff or less), it’s almost a no-brainer that it will be more cost-effective and safer to host your email and data in the cloud, as opposed to running your own complex CRMs and Exchange servers. If you’re a large org, it might be much more complex, as larger enterprise apps sometimes depend on that Exchange server being in place. But, all in all, cloud computing is a viable option that might be a good fit for you — check it out, thoroughly.

I’ll finish this thread up with one more post on budgeting and change management in the next few weeks.

Why SharePoint Scares Me

This post was originally published on the Idealware Blog in July of 2009.

For the past four years or so, at two different organizations, I’ve been evaluating Microsoft’s Sharepoint 2007 as a Portal/Intranet/Business Process Management solution.  It’s a hard thing to ignore, for numerous reasons:

  • It’s an instant content, document and data management interface out of the box, with strong interactive capabilities and hooks to integrate other databases. If you get the way it uses lists and views to organize and display data, it can be a very powerful tool for managing and collaborating on all sorts of content.  As I said a year or two ago in an article on document management systems, it has virtually all of the functionality of the expensive, commercial products, and those products aren’t full-fledged portals and Intranet sites as well.
  • Sharepoint 2007 (aka MOSS) is not free, but I can pick it up via Techsoup for a song.
  • It integrates with Microsoft Exchange and Office, to some extent, as well as my Windows Directory, so, as I oversee a Windows network, it fits in without my having to fuss with tricky LDAP and SMTP integrations.
  • All pretty compelling, and I’m not alone — from the nonprofit CIO and IT Director lists I’m on, I see that lots of other mid to large-sized organizations are either considering Sharepoint, or already well-ensconced.

So, why does Sharepoint scare me?

  • What it does out of the box, it does reasonably well.  Not a great or intuitive UI, but it’s pretty powerful. However, advanced programming and integration with legacy systems can get really complicated very fast.  It is not a well-designed database, and integration is based on SOAP, not the far less complicated REST standard, meaning that having someone with a strong Microsoft and XML programming skill set on board is a prerequisite for doing anything serious with it.
  • MOSS is actually two major, separately developed applications (Windows Sharepoint Services and Content Management Server) that were hastily merged into one app.  As with a lot of immature Microsoft products, the motivation seems to have been more about marketing a powerful app than about making it actually functional.  Sharepoint 2013 or 2016 will likely be a good product, kind of like Exchange 2007 or SQL Server 2005, but Sharepoint 2007 makes a lot of promises that it doesn’t really keep.
  • Sharepoint’s primary structure is a collection of “sites”, each with its own URL, home page, and extensions. Without careful planning, Sharepoint can easily become a junkyard, with function-specific sites littered all over the map.  A number of bloggers are pushing a “Sharepoint invites silos” meme these days.  I stop short of blaming Sharepoint – it does what you plan for.  But if you don’t plan, or you don’t have the buy-in, attention and time commitment of key staff both in and out of IT, then silos are the easiest thing for Sharepoint to produce.
  • The database stores documents as database blobs, as opposed to linking to files on disk, threatening the performance of the database and putting the documents at risk of corruption. I don’t want to take my org’s critical work product and put it in a box that could easily break.
  • Licensing for use outside of my organization is complicated and expensive. MOSS access requires two or three separate licenses for each user: a Windows Server license; a Sharepoint license; and, if you’re using the advanced Sharepoint features, an additional license for that functionality.  So, if I want to set up a site for our Board, or extend access to key partners or clients, it’s going to cost for each one.  There’s an option to buy an unlimited access license, but, the last time I looked, this was prohibitively expensive even at charity pricing.
  • Compared to most Open Source portals, Sharepoint’s hardware and bandwidth requirements are significantly higher. Standard advice is that you will need additional, expensive bandwidth-optimizing software in order to make it bearable on a WAN. For good performance on a modest installation, you’ll need at least two powerful servers, one for SQL Server and one for Sharepoint; for larger installations, a server farm.

I can’t help but contrast this with the far more manageable and affordable alternatives, even if those alternatives aren’t the kitchen sink that Sharepoint is.  Going with a non-Microsoft portal, I might lose all of that out-of-the-box integration with my MS network, but I would jettison the complexity, the resource demands, and the potential for confusion and site sprawl.  I’m not saying that any portal/intranet/knowledge management system can succeed without cross-departmental planning, but I am saying that the risk of a project being ignored — particularly if the financial investment was modest, and Sharepoint’s not cheap, even if the software can be — is easier to deal with than the risk of a project that is fractured but critical.

If my goal is to promote collaboration and integrated work in my organization, using technology that transcends and discourages silos, I’m much better off with apps like Drupal, KnowledgeTree, Plone, or Salesforce, all of which do big pieces of what Sharepoint does; they require supplemental applications to match Sharepoint’s smorgasbord of functionality, but they are much less complicated and expensive to deploy.

After four years of agonizing on this, here’s my conclusion: When the product matures, if I have organizational buy-in and interest; a large hardware budget; a high-performance Wide Area Network, and a budget for consulting, Sharepoint will be a great way to go. Under the conditions that I have today — some organizational buy-in; modest budget for servers and no budget for consulting; a decent network, but other priorities for the bandwidth, such as VOIP and video — I’d be much better served with the alternatives.

XML, API, CSV, SOAP! Understanding The Alphabet Soup Of Data Exchange

This article was originally published at Idealware in October of 2007.

Let’s say you have two different software packages, and you’d like them to be able to share data. What would be involved? Can you link them so they exchange data automatically? And what do all those acronyms mean? Peter Campbell explains.

There has been a lot of talk lately about data integration, Application Programming Interfaces (APIs), and how important these are to non-profits. Much of this talk has focused on the major non-profit software packages from companies like Blackbaud, Salesforce.com, Convio, and Kintera. But what is it really about, and what does it mean to the typical org that has a donor database, a web site, and standard business applications for Finance, Human Resources and payroll? In this article, we’ll bypass all of the acronyms for a while and then put the most important ones into perspective.

The Situation

Nonprofits have technology systems, and they live and die by their ability to manage the data in those systems to effectively serve their missions. Unfortunately, however, nonprofits have a history of adopting technology without a plan for how different applications will share data. This isn’t unique to the nonprofit sector: throughout the business world, data integration is often underappreciated.
Here’s a simple example: Your mid-sized NPO has five fundraising staff people who together bring in $3,000,000 in donations every year. How much more would you bring in with six fundraising people? How much less with four? If you could tie your staffing cost data to hours worked and donations received, you would have a payroll-to-revenue metric that could inform critical staffing decisions. But if the payroll data is in an entirely different database from the revenue data, they can’t be easily compared.
Similarly, donations are often tracked in both a donor database and a financial system. If you’ve ever had to explain to the board why the two systems show different dollar amounts (perhaps because finance operates on a cash basis while fund development works on accrual), you can see the value in having systems that can reconcile these differences.

How can you solve these data integration challenges? Short of buying a system that tracks every piece of data you may ever need, data exchange is the only option. This process of communicating data from one system to another could be done by a straightforward manual method, like asking a staff member to export data from one system and import it into another. Alternatively, automatic data transfers can save on staff time and prevent trouble down the road – and they don’t have to be as complex as you might think.
What does it take to make a data exchange work? What is possible with your software applications? This article explains what you’ll need to consider.

 

Components of Data Exchange

Let’s get down to the nitty-gritty. You have two applications, and you’d like to integrate them to share data in some way: to pull data from one into another, or to exchange data in both directions. What has to happen? You’ll need four key components:

  • An Initiating Action. Things don’t happen without a reason, particularly in the world of programming. Some kind of triggering action is needed to start the data interchange process. For an automatic data exchange, this is likely to be either a timed process such as a scheduler kicking off a program at 2AM every night, or a user action – for instance, a visitor clicking the Submit button on your website form.
  • A Data Format. The data to be transferred needs to be stored and transferred in some kind of logical data format – for instance, a comma-delimited text file – that both systems can understand.
  • A Data Transfer Mechanism. If both applications reside on your own network, then a transfer is likely to be straightforward – perhaps you can just write a file to a location where another application can read it. But if one or both applications live offsite, you might need to develop a process that transfers the data over the internet.
  • A Transformation and Validation Process. Data rarely comes out of one system in exactly the form the other system needs, so it often has to be reformatted, checked for errors, and screened for duplicates before it is loaded.

Let’s look at each of these components in more detail.

 

Initiating Action

An initiating action is what starts things rolling in the data exchange process. In most cases, it would take one of three forms:

  • Human Kickoff. If you’re manually exporting and importing files, or need to run a process on a schedule that’s hard to determine in advance, regular old human intervention can start the process. An administrator might download a file, run a command line program, or click a button in an admin interface.
  • Scheduler. Many data exchanges rely on a schedule – checking for new information every day, every hour, every five minutes, or some other period. These kinds of exchanges are initiated by a scheduler application. More complex applications might have a scheduling application built in, or might integrate with the Windows Scheduler or Unix/Linux cron (a small scripted example follows this list).
  • End User Action. If you want two applications to be constantly in synch, you’ll need to try to catch updates as they happen. Typically, this is done by initiating a data exchange based on some end user action, such as a visitor clicking the Submit button on an online donation form.
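To make the scheduler case concrete, here’s a small sketch (my own illustration, not tied to any particular product) of the kind of script that a scheduler might kick off each night. It’s written in Python, and the database, file names, and field names are all hypothetical; the point is simply that the scheduler provides the trigger, and the script does the export:

# nightly_export.py - a sketch of a scheduler-initiated data exchange.
# A cron entry such as "0 2 * * * python /path/to/nightly_export.py"
# (or an equivalent Windows Scheduler task) would run this at 2 AM each night.

import csv
import datetime
import sqlite3  # stands in for whatever database your application actually uses

LAST_RUN_FILE = "last_run.txt"       # hypothetical bookmark file
EXPORT_FILE = "new_donations.csv"    # hypothetical output file

def read_last_run():
    """When did we last export? If we never have, start from the beginning of time."""
    try:
        with open(LAST_RUN_FILE) as f:
            return f.read().strip()
    except FileNotFoundError:
        return "1970-01-01 00:00:00"

def main():
    since = read_last_run()
    conn = sqlite3.connect("donations.db")  # hypothetical source database
    rows = conn.execute(
        "SELECT donor_name, amount, donated_at FROM donations WHERE donated_at > ?",
        (since,),
    ).fetchall()

    # Write anything new to a CSV file for the destination system to pick up.
    with open(EXPORT_FILE, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["donor_name", "amount", "donated_at"])
        writer.writerows(rows)

    # Record when we ran, so the next pass only grabs newer records.
    with open(LAST_RUN_FILE, "w") as f:
        f.write(datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"))

if __name__ == "__main__":
    main()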

 

 

Data Formats

In order to transfer data from one system to another, the systems need to have a common understanding of how the data will be formatted. In the old days, things were pretty simple: you could store data in fixed format text files, or as bits of information with standard delimiting characters, commonly called CSV for “Comma Separated Values”. Today, we have a more dynamic format called XML (eXtensible Markup Language).
An example fixed format file could be made up of three lines, each 24 characters long:

Name (20)           Gender (1) Age (3)
Susan               f          25
Mark                m          37

 

A program receiving this data would have to be told the lengths and data types of each field, and programmed to receive data in that exact format. The same data in CSV format would look like this:

"Susan","f",25
"Mark","m",37

CSV is easier to work with than fixed formats, because the receiving system doesn’t have to be as explicitly informed about the incoming data. CSV is almost universally supported by applications, but it poses challenges as well. What if your data has quotes and commas in it already? And as with fixed formats, the receiving system will still need to be programmed (or “mapped”) to know what type of data it’s receiving.
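To illustrate, here’s a short Python sketch (my own example, not tied to any particular product) that reads the two records above using Python’s standard csv module, which takes care of the quoting; note that the field names still have to be supplied by the programmer, since CSV doesn’t carry them:

import csv
import io

# The same records shown above, as they might arrive in a CSV export.
raw = '"Susan","f",25\n"Mark","m",37\n'

# The receiving program has to be told what the columns mean; CSV doesn't say.
fieldnames = ["name", "gender", "age"]

reader = csv.DictReader(io.StringIO(raw), fieldnames=fieldnames)
for row in reader:
    print(row["name"], row["gender"], row["age"])
# Prints:
# Susan f 25
# Mark m 37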
CSV is the de facto data format standard for one-time exports and data migration projects. However, automating CSV transfers requires additional programming – batch files or scripts that will work with a scheduling function. Newer standards, like XML, are web-based and work in browsers, allowing for a more dynamic relationship with the data sets and less external programming.
The XML format is known as a “self-describing” format, which makes it a bit harder to look at but far easier to work with. The information about the data, such as field names and types, is encoded with the data, so a receiving system that “speaks” XML can dynamically receive it. A simple XML file looks like this:

<PEOPLE>
  <PERSON>
    <NAME>Susan</NAME>
    <GENDER>f</GENDER>
    <AGE>25</AGE>
  </PERSON>
  <PERSON>
    <NAME>Mark</NAME>
    <GENDER>m</GENDER>
    <AGE>37</AGE>
  </PERSON>
</PEOPLE>

An XML-friendly system can use the information in the file itself to dynamically map the data to its own database, making the process of getting a data set from one application to another far less laborious than with a CSV or fixed-width file. XML is the de facto standard for transferring data over the internet.
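Here’s a short Python sketch (again, my own illustration) that reads the PEOPLE example above with nothing but the standard library; notice that the field names come out of the file itself rather than being hard-coded into the receiving program:

import xml.etree.ElementTree as ET

xml_data = """
<PEOPLE>
  <PERSON><NAME>Susan</NAME><GENDER>f</GENDER><AGE>25</AGE></PERSON>
  <PERSON><NAME>Mark</NAME><GENDER>m</GENDER><AGE>37</AGE></PERSON>
</PEOPLE>
"""

root = ET.fromstring(xml_data)
for person in root.findall("PERSON"):
    # The tag names travel with the data, so we can discover them as we go.
    record = {child.tag: child.text for child in person}
    print(record)
# {'NAME': 'Susan', 'GENDER': 'f', 'AGE': '25'}
# {'NAME': 'Mark', 'GENDER': 'm', 'AGE': '37'}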

Data Transfer Mechanisms

As we’ve talked about, an initiating action can spur an application to create a formatted data set. However, getting that data set from one application to another requires some additional work.
If both of your applications are sitting on the same network, then this work is likely pretty minimal. One application’s export file can easily be seen and uploaded by another, or you might even be able to establish a database connection directly from one application to another. However, what if the applications are in different locations? Or if one or both are hosted by outside vendors? This is where things get interesting.
There are multiple ways to exchange data over the web. Many of them are specific to the type of web server (Apache vs. Microsoft’s IIS) or operating system (Unix vs. Linux vs. Windows) you’re using. However, two standards – called “web services” – have emerged as by far the most common methods for simple transfers: SOAP (Simple Object Access Protocol) and REST (Representational State Transfer).
Both SOAP and REST transfer data via the standard transfer protocol mechanism of the web: HTTP. To explain the difference between REST and SOAP, we’ll take a brief detour and look at HTTP itself.
HTTP is a very simple-minded thing. It allows you to send data from one place to another and, optionally, receive data back. Most of it is done via the familiar Uniform Resource Identifier (URI) that is typed into the address bar of a web browser, or encoded in a link on a web page, with a format similar to:

http://www.somewhere.com?parameter1=something&parameter2=somethingelse

There are two methods built into HTTP for exchanging data: GET and POST.

  • GET exchanges data strictly through the parameters to the URL, which are always in “this equals that” pairs. It is a one-way communication method – once the information is sent to the receiving page, the web server doesn’t retain the parameter data or do anything else with it.
  • POST stores the transferred information in a packet that is sent along with the URI – you don’t see the information attached to the URI in the address bar. POST values can be altered by the receiving page and returned. In almost any situation where you’re creating an account on a web page or doing a shopping transaction, POST is used.

The advantage to GET is that it’s very simple and easy to share. The advantages to POST are that it is more flexible and more secure. You can put a GET URI in any link, on or offline, while a POST transfer has to be initiated via an HTML Form.
REST works much like the plain GET and POST requests described above: the data travels over HTTP, and the response typically comes back as XML for your program to use. SOAP is a more elaborate standard that wraps the data in a formal XML “envelope,” which adds structure but also complexity. Add to the mix that Microsoft was one of the principal developers of the SOAP specification, and most Microsoft applications require that you use SOAP to transfer data. REST might be more appealing if you only need to do a simple data exchange, but if you’re working with Microsoft servers or applications, it is likely not an option.
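For a feel of what these two methods look like from a program’s point of view, here’s a Python sketch using the standard urllib module. The addresses and parameters are made up for illustration; a REST-style exchange is often little more than requests like these, with the reply coming back as XML:

import urllib.parse
import urllib.request

# GET: the data rides along in the URL itself, as parameter pairs.
params = urllib.parse.urlencode({"parameter1": "something", "parameter2": "somethingelse"})
with urllib.request.urlopen("http://www.somewhere.com/lookup?" + params) as response:
    print(response.read().decode("utf-8"))

# POST: the data is sent in the body of the request, not in the address bar.
form_data = urllib.parse.urlencode({"name": "Susan", "amount": "25"}).encode("utf-8")
with urllib.request.urlopen("http://www.somewhere.com/donate", data=form_data) as response:
    print(response.read().decode("utf-8"))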

Transformation and Validation Processes

While this article is focused on the mechanics of extracting and moving data, it’s important not to lose sight of the fact that data often needs a lot of work before it should be loaded into another system. Automated data exchange processes need to be designed with extreme care, as it’s quite possible to trash an entire application by corrupting data, introducing errors, or flooding the system with duplicates.
In order to get the data ready for upload, use transformation and validation processes. These processes could be kicked off either before or after the data transfer, or multiple processes could even take place at different points in time. An automated process could be written in almost any programming language, depending on the requirements of your target applications and your technical environment. Depending on your needs, these processes might include the following (a brief code sketch follows the list):

 

  • Converting file formats. Often, one application will export a data file with a particular layout of columns and field names, while the destination application will demand another.
  • Preventing duplicates. Before loading in a new record, it’s important to ensure that it doesn’t already exist in the destination application.
  • Backup and logging. It’s likely a good idea to kick off a backup of your destination database before importing the data, or at least to log what you’ve changed.
  • User interface. For complex processes, it can be very useful to provide an administrative interface that allows someone to review what data will change and resolve errors prior to the import.
  • Additional Data Mining. If you’re writing a process that analyzes data, adding routines that flag unusual occurrences for review can be very useful. Or if you’re uploading donation data that also has to go to Finance, why not concurrently save that into a CSV file that Finance can import into their system? There are plenty of organizational efficiencies that can be folded into this process.
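As a small illustration of the first two items on that list, here’s a Python sketch that renames columns to match a destination system and screens out records that already exist there. The file names, column names, and the idea of matching on email address are all hypothetical; a real process would follow whatever rules your destination application requires:

import csv

# Map the source system's column names onto the destination's (hypothetical names).
COLUMN_MAP = {"Donor Name": "name", "Email Address": "email", "Gift Amount": "amount"}

def load_existing_emails(path):
    """Emails already in the destination system, used to screen out duplicates."""
    with open(path, newline="") as f:
        return {row["email"].strip().lower() for row in csv.DictReader(f)}

def transform(source_path, existing_path, output_path):
    existing = load_existing_emails(existing_path)
    skipped = 0
    with open(source_path, newline="") as src, open(output_path, "w", newline="") as out:
        writer = csv.DictWriter(out, fieldnames=list(COLUMN_MAP.values()))
        writer.writeheader()
        for row in csv.DictReader(src):
            record = {new: row[old].strip() for old, new in COLUMN_MAP.items()}
            if record["email"].lower() in existing:
                skipped += 1          # duplicate: don't load it twice
                continue
            writer.writerow(record)
    print(f"Skipped {skipped} records already in the destination system.")

# transform("export_from_donor_db.csv", "emails_in_destination.csv", "ready_to_import.csv")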

As described in the API section below, a sophisticated application may provide considerable functionality that will help in these processes.

Application Programming Interfaces (APIs)

What about APIs? How do they fit in? We’re hundreds of words into this article without even a mention of them – how can that be? Well, APIs are a fuzzy concept that might encompass all the aspects of data exchange we just discussed, or some of them, or none of them at all. Clear as mud, right?
An API is exactly what it says – an interface, or set of instructions, for interacting with an application via a programming language.
Originally, APIs were built so that third party developers could create integrating functions more easily. For instance, a phone system vendor might write specific functions into their operating system so that a programmer for a voice mail company could easily import, extract, and otherwise work with the phone system data. This would usually be written in the same programming logic as the operating system, and the assumption was that the third party programmer knew that language. Operating systems like Unix and Windows have long had APIs, allowing third parties to develop hardware drivers and business applications that use OS functions, such as Windows’ file/open dialog boxes.

 

APIs are written to support one or more programming languages – such as PHP or Java – and require a programmer skilled in one of these languages. An API is also likely to be geared around specific data format and transfer standards – for instance, it may only accept data in a particular XML format, and only via a SOAP interface. In most cases, you’ll be limited to working with the supported standards for that API.
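To give a feel for what “a particular XML format, via a SOAP interface” can mean in practice, here’s a rough Python sketch. The endpoint, namespace, and AddDonor operation are invented for illustration; a real API publishes its own operations (usually in a WSDL document), and a SOAP toolkit would normally build this envelope for you:

import urllib.request

# A minimal SOAP-style envelope. The operation and namespace are hypothetical.
envelope = """<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <AddDonor xmlns="http://example.org/donorapi">
      <Name>Susan</Name>
      <Amount>25</Amount>
    </AddDonor>
  </soap:Body>
</soap:Envelope>"""

request = urllib.request.Request(
    "http://example.org/donorapi/service.asmx",      # hypothetical endpoint
    data=envelope.encode("utf-8"),
    headers={
        "Content-Type": "text/xml; charset=utf-8",
        "SOAPAction": "http://example.org/donorapi/AddDonor",
    },
)
with urllib.request.urlopen(request) as response:
    print(response.read().decode("utf-8"))   # the reply comes back as XML, too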

 

Choose Your Own Data Exchange Adventure

The type of data exchange that makes sense, and how complex it will be, varies widely. A number of factors come into play: the applications you would like to integrate, the available tools, the location of the data, and the platform (i.e., Windows, Linux, web) you’re using. For instance:

  • Stripped all the way down to the basics, manual data exchange is always an option. In this case, an administrator (a Human Kickoff initiating action) might download a file in CSV format, save it to the network, perform some manual transformations to put it into the appropriate file format, and upload it into a different system.
  • For two applications on the same network, the process might not be too much more complex. In this case, a Scheduler initiating action might prompt one application to export a set of data as a CSV file and save it to a network drive. A transformation program might then manipulate the file and tell the destination application to upload the new data.
  • Many web-based tools offer simple ways to extract data. For instance, to get your blog’s statistics from the popular tracking service FeedBurner, you could use a scheduled initiating action to simply request a FeedBurner page via HTTP, which would then provide you the statistics on an XML page. Your program could then parse and transform the data in order to load it into your own reporting application or show it on your own website. Many public applications, such as Google Maps, offer similarly easy functionality to allow you to interact with them, leading to the popularity of mashups – applications that pull data (generally via APIs) from two or more websites.
  • If you are using a website Content Management System which is separate from your main constituent management system, you may find yourself with two silos containing constituent data – members who enrolled on your web site and donors tracked in a donor database. In this circumstance, you might set up a process that kicks off whenever someone submits the Become a Member form. This process could write the data for the new member into an XML file, transfer that file to your server, and there kick off a new process that imports the new members while checking for duplicates.

Finding Data-Exchange-Friendly Software

As is likely clear by now, the methods you can use to exchange data depend enormously on the software packages that you chose. The average inclination when evaluating software is to look for the features that you require. That’s an important step in the process, but it’s only half of the evaluation. It’s also critical to determine how you can – or if you can – access the data. Buying into systems that overcomplicate or restrict this access will limit your ability to manage your business.
Repeat this mantra: I will not pay a vendor to lock me out of my own data. Sadly, this is what a lot of data management systems do, either by maintaining poor reporting and exporting interfaces, or by including license clauses that void the contract if you try to interact with your data in unapproved ways (including leaving the vendor).
To avoid lock-in and ensure the greatest amount of flexibility when looking to buy any new application – particularly the ones that store your data off-site and give you web-based access to it – ask the following questions:

  • Can I do mass imports and updates on my data? If the vendor doesn’t allow you to add or update the system in bulk with data from other systems, or their warranty prohibits mass updates, then you will have difficulty smoothly integrating data into this system.
  • Can I take a report or export file, make a simple change to it, and save my changes? The majority of customized formats are small variations on the standard formats that come with a system. But it’s shocking how many web-based platforms don’t allow you to save your modifications.
  • Can I create the complex data views that are useful to me? Most modern donor, client/case management and other databases are relational. They store data in separate tables. That’s good – it allows these systems to be powerful and dynamic. But it complicates the process of extracting data and creating customized reports. A donor’s name, address, and amount that they have donated might be stored in three different, but related tables. If that’s the case, and your reporting or export interface doesn’t allow you to report on multiple tables in one report, then you won’t be able to do a report that extracts names and addresses of all donors who contributed a certain amount or more. You don’t want to come up with a need for information and find that, although you’ve input all the data, you can’t get it out of the system in a useful fashion.
  • Does the vendor provide a data dictionary? A data dictionary is a chart identifying exactly how the database is laid out. If you don’t have this, and you don’t have ways of mapping the database, you will again be very limited in reporting on and extracting data from the application.
  • What data formats can I export data to? As discussed, there are a number of formats that data can be stored in. You want a variety of options for industry standard formats.
  • Can I connect to the database itself? Particularly if the application is installed on your own local network, you might be able to access the database directly. The ability to establish an ODBC connection to the data, for instance, can provide a comparatively easy way to extract or update data (see the sketch after this list). Consider, however, what will happen to your interface if the vendor upgrades the database structure.
  • Can I initiate data exports without human intervention? Check to see if there are ways to schedule exports, using built-in scheduling features or by saving queries that can be run by the Windows Scheduler (or something similar). If you want to integrate data in real time, determine what user actions you can use to kick off a process. Don’t allow a vendor to lock you out of the database administrator functions for a system installed on your own network.
  • Is there an API? APIs can save a lot of time if you’re building a complex data exchange. For some systems, it may be the only way to get data in or out without human intervention. Don’t assume any API is a good API, however – make sure it has the functions that will be useful to you.
  • Is there a data exchange ecosystem? Are there consultants who have experience working with the software? Does the software support third party packages that specialize in extracting data from one system, transforming it, and loading it into another? Is there an active community developing add-ons and extensions to the application that might serve some of your needs?
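As an example of the kind of direct access described above, here’s a sketch that uses the pyodbc library (one common way to make an ODBC connection from Python) to pull donor names, addresses, and gift totals out of three related tables. The DSN, table names, and column names are all hypothetical; in real life they would come from the vendor’s data dictionary:

import pyodbc  # third-party library: pip install pyodbc

# The DSN and credentials are placeholders for whatever your vendor documents.
conn = pyodbc.connect("DSN=donordb;UID=report_user;PWD=secret")
cursor = conn.cursor()

# Donors who have given $500 or more, assembled from three related (hypothetical) tables.
cursor.execute("""
    SELECT d.first_name, d.last_name, a.street, a.city, SUM(g.amount) AS total_given
    FROM donors d
    JOIN addresses a ON a.donor_id = d.donor_id
    JOIN gifts g ON g.donor_id = d.donor_id
    GROUP BY d.first_name, d.last_name, a.street, a.city
    HAVING SUM(g.amount) >= 500
""")

for row in cursor.fetchall():
    print(row.first_name, row.last_name, row.street, row.city, row.total_given)

conn.close()

And remember the caveat above: if the vendor restructures those tables in an upgrade, a query like this will need to be revisited.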

Back to Reality

So, again, what does all of this really mean to a nonprofit organization? From a historical perspective, it means that despite the preponderance of acronyms and the lingering frustrations of some companies limiting their options, integration has gotten easier and better. If you picked up this article thinking that integrating and migrating data between applications and web sites is extremely complex, well, it isn’t, necessarily – it’s sometimes as simple as typing a line in your browser’s address bar. But it all depends on the complexity of the data that you’re working with, and the tools that your software application gives you to manage that data.

 

For More Information

An Introduction to Integrating Constituent Data: Three Basic Approaches
A higher level, less technical look at data integration options

The SOAP/XML-RPC/REST Saga
A blog article articulating the differences – from a more technical perspective – between REST and SOAP.

Mashup Tools for Consumers
New York Times article on the Mashup phenomenon

W3 Hardcore Data Standard Definition
W3, the standards body for the internet. The hardcore references for HTTP, XML, SOAP, REST and other things mentioned here.

Web API List
Techmagazine’s recent article linking to literally hundreds of applications that have popular Web APIs

Peter Campbell is currently the Director of Information Technology at Earthjustice, a non-profit law firm dedicated to defending the earth. Prior to joining Earthjustice, Peter spent seven years serving as IT Director at Goodwill Industries of San Francisco, San Mateo & Marin Counties, Inc. Peter has been managing technology for non-profits and law firms for over 20 years, and has a broad knowledge of systems, email and the web. In 2003, he won a “Top Technology Innovator” award from InfoWorld for developing a retail reporting system for Goodwill’s thrift operations. Peter’s focus is on advancing communication, collaboration and efficiency through creative use of the web and other technology platforms. In addition to his work at Earthjustice, Peter maintains a number of personal and non-profit web sites; blogs on NPTech tools and strategies at http://techcafeteria.com; is active in the non-profit community as a member of NTEN; and spends as much quality time as possible with his wife, Linda, and eight-year-old son, Ethan.

Steve Anderson of ONE/Northwest, Steven Backman of Design Database Associates, Paul Hagen of Hagen20/20, Brett Meyer of NTEN, and Laura Quinn of Idealware also contributed to this article.

Data Exchange Article Up at Idealware

My article “XML, API, CSV, SOAP! Understanding the Alphabet Soup of Data Exchange” is up at idealware.org. This is intended as a primer for those of you trying to make sense of all of this talk about Application Programming Interfaces (APIs) and data integration. It discusses, with examples, the practical application of some of the acronyms, and suggests some recommended practices around data system selection and deployment. Credit has to go to Laura Quinn, webmaster at Idealware, who really co-wrote the article with me but didn’t take much credit, and to our reviewers, Paul Hagen, Steve Anderson and Steven Backman, who added great insights to a pretty heady topic.

The article went through a lot of rewrites, and we had to cut out a fair amount in order to turn it into something cohesive, so I hope to blog a bit on some of the worthwhile omissions soon, but my day job at Earthjustice has been keeping me pretty busy.

How To Find Data-Exchange-Friendly Software

This article was co-written by Laura Quinn of Idealware and first published on the NTEN Blog in October of 2007.

 Peter Campbell, Techcafeteria, and Laura Quinn, Idealware

This is an excerpt adapted from Idealware’s article, “XML, API, CSV, SOAP! Understanding the Alphabet Soup of Data Exchange”.

Repeat this mantra: I will not pay a vendor to lock me out of my own data. Sadly, this is what a lot of data management systems do, either by maintaining poor reporting and exporting interfaces or by including license clauses that void the contract if you interact with your data in unapproved ways.

The software you choose has an enormous impact on whether you can effectively get data in or pull it out to integrate with other packages. If you only look at the front end features, you’re only conducting half an evaluation. It’s also critical to determine how you can — or if you can — access the data.

To avoid lock-in and ensure the greatest amount of flexibility when looking to buy any new application — particularly the ones that store your data off-site and give you web-based access to it — ask the following questions:

  1. Can I do mass imports and updates on my data? If the vendor doesn’t allow you to add or update the system in bulk with data from other systems, or their warranty prohibits mass updates, then you will have difficulty smoothly integrating data into this system.
  2. Can I take a report or export file, make a simple change to it, and save my changes? The majority of customized formats are small variations on the standard formats that come with a system. But it’s shocking how many web-based platforms don’t allow you to save your modifications.
  3. Can I create the complex data views that are useful to me? Most modern donor, client/case management and other databases are relational. They store data in separate tables. That’s good – it allows these systems to be powerful and dynamic. But it complicates the process of extracting data and creating customized reports. A donor’s name, address, and amount that they have donated might be stored in three different, but related tables. If that’s the case, and your reporting or export interface doesn’t allow you to report on multiple tables in one report, then you won’t be able to do a report that extracts names and addresses of all donors who contributed a certain amount or more. You don’t want to come up with a need for information and find that, although you’ve input all the data, you can’t get it out of the system in a useful fashion.
  4. Does the vendor provide a data dictionary? A data dictionary is a chart identifying exactly how the database is laid out. If you don’t have this, and you don’t have ways of mapping the database, you will again be very limited in reporting on and extracting data from the application.
  5. What data formats can I export data to? As discussed, there are a number of formats that data can be stored in, such as CSV (Comma Separated Values – a great format for manual imports and exports) and XML (eXtensible Markup Language – better for automatic integration). You want a variety of options for industry standard formats.
  6. Can I connect to the database itself? Particularly if the application is installed on your own local network, you might be able to access the database directly. The ability to establish an ODBC connection to the data, for instance, can provide a comparatively easy way to extract or update data. Consider, however, what will happen to your interface if the vendor upgrades the database structure.
  7. Can I initiate data exports without human intervention? Check to see if there are ways to schedule exports, using built-in scheduling features or by saving queries that can be run by the Windows Scheduler (or something similar). Don’t allow a vendor to lock you out of the database administrator functions for a system installed on your own network. If you want to integrate data in real time, determine what user actions can kick off a process (for instance, clicking Submit on a web form).
  8. Is there an API? APIs can provide a very useful set of functions for importing, exporting, and moving data programmatically. For some systems, an API may be the only way to get data in or out without human intervention. Don’t assume any API is a good API, however – make sure it has the functions that will be useful to you.
  9. Is there a data exchange ecosystem? Are there consultants who have experience working with the software? Does the software support third party packages that specialize in extracting data from one system, transforming it, and loading it into another? Is there an active community developing add-ons and extensions to the application that might serve some of your needs?

Evaluating software packages for data exchange capabilities can’t be an afterthought. It’s too important. Buying into systems that over-complicate or restrict your access to data will limit your ability to manage your business, both today and as long as you own the package.

 

What happened?

Well, work happened, and I have to admit that I am not the driven blogger who can maintain a steady flow of posts while working full-time. I’ve been doing a consulting/contracting gig in San Jose that not only keeps me busy, but takes huge chunks out of my day for the commute, so my attention to Techcafeteria has suffered unduly. I’ll be wrapping up the work in San Jose and transitioning to a new, full-time position over the next month or two, returning to the ranks of Non-Profit IT Directors that I didn’t imagine I’d stay out of for long. More on that position later – I’ve been asked to keep it under wraps for a week or so.

So I’ll be closing the consulting services section of Techcafeteria, but I’ll be keeping the website going as time affords. It’s been an interesting year for me, so far. From 1986 until 2007, I held three jobs. I stayed at each one for at least six years, and I secured the next one before leaving the prior. I haven’t been unemployed (aka self-employed) for over two decades. But I have a bit of a self-imposed challenge – I want a job with deep business and technology challenges, at an organization with a worthwhile mission, at a pay scale that, while not extravagant, is enough to support my family living in the Bay Area, where my partner spends most of her time homeschooling our son. Those opportunities aren’t a dime a dozen. I reached a point early in the year where I was downright desperate to leave the job that I was at (a long story that I have no intention of relating here!), and applied at some for-profit companies. I think I sabotaged myself in the interviews, because it eventually became clear to me that having day to day work that combats social or environmental injustice is a personal requirement of mine. My partner supports this — she was proud to tell people that I worked for Goodwill and she’s even more excited about my new gig, which sports a killer tagline. So setting up the consulting practice was — and probably will be again — a means of staying solvent while I was very picky about what I applied for.

One job that I pursued was with an org called the Pachamama Alliance. They are a fascinating group of people. Their story is that the indigenous people of Ecuador put out a call for help to the Western World as they saw the earth and their culture being destroyed by the clearing of the rainforests. The group forming Pachamama answered that call, and their mission is to “change the dream of the western world” into one that is in harmony with nature, as opposed to dominance and disrespect of it. They maintain that environmental injustice and social injustice are tied at the knees – where you find one, you’ll find the other. For those of you who saw Gore’s “An Inconvenient Truth”, you’ll recall the fact that the main water source for the Sudan dried up a few years ago. That bit of trivia puts the subsequent genocide in Darfur in an interesting perspective. Pachamama has adopted Gore’s tactics with a multimedia presentation that both educates and inspires people to adopt a more sustainable dream. It’s a timely movement, as it’s becoming clear to all of us that our current rate of consumption of natural resources is having dramatic impacts on the environment. Pachamama spreads the word by training volunteers to share the presentation. Well worth checking out.

In other news, I’m hard at work on an article for Idealware that attempts to deflate all of this big talk about APIs and put it in terms that anyone can use to understand why they might want to migrate data and how they might do it. I’m also talking with my friends at NTEN about doing a webinar on the best practices for rolling out CRM at a non-profit. As long-time blog readers have probably picked up, I consider Constituent Relationship Management software to be the type of technology that, deployed correctly, completely alters the way a business is run. It’s not just about maintaining business relationships and tracking donors – it’s about working collaboratively and breaking down the silos of business relationships and data. So installing the software (if software even needs to be installed) is the least of it, and data migration is just a chore. But aligning business strategy to CRM technology is the real challenge.

So, I’ll post next week about my new gig, and look forward to a long life for Techcafeteria as a resource on non-profit technology, with less of the hawking of services.

Instant Open API with Rails 2.0

Day 2 at the Ruby on Rails conference – after the Keynote.

My main focus is on technology trends that allow us all to make better use of the vast amounts of information that we store in myriad locations and formats across diverse systems. The new standards for database manipulation (SQL), data interchange (XML) and data delivery (RSS) are huge developments in an industry that has traditionally offered hundreds of different ways of managing, exporting and delivering data, none of which worked particularly well — if at all — with anybody else’s method. The technology industry has tried to address this with one-size-fits-all options — Oracle, SAP, etc., offering Enterprise Resource Planning platforms that should be all things to all people. But these are expensive options that require a stable of high-paid programmers on hand to develop. I strongly advocate that we don’t need to have all of our software on one platform, but that all data management systems have to support standardized methods of exchanging information. I boil it all down to this:

It’s your data. Data systems should not restrict you from doing what you want to do with your data, and they should offer powerful and easy methods of accessing the data. You can google the world for free. You shouldn’t have to pay to access your own donor information in meaningful ways.

How can the software developers do this? By including open Application Programming Interfaces (APIs) that support web standards.

So what does this have to do with Ruby on Rails? At the keynote this morning, David Heinemeier Hansson showed us the improvements coming up in Ruby on Rails 2.0. And he started with a real world example: an address book. Bear with me.

  1. He created the project (one line entered at a command prompt).
  2. He created the database (another line).
  3. He used Rails’ scaffolding feature to create some preliminary HTML and code for working with his address book (one more line).
  4. He added a couple of people to the address book.

At this point, with a line or so of code, he was able to produce HTML, XML, RSS and CSV outputs of his data. The new scaffolding in 2.0 automatically builds the API. I could get a lot more geeky about the myriad ways that Ruby on Rails basically ensures that your application will be, out of the box, open, but I think that says it well.

Think of what this means to the average small business or non-profit:

  • You need a database to track, say, web site members, and you want to further integrate that with your CRM system. With Rails, you can, very quickly, create a database; generate (via scaffolding) the input forms; and easily export all data to CSV or XML, either of which can be imported into a decent CRM.
  • You want to offer newsfeeds on your web site. Create the simple database in Rails. Generate the basic input forms. Give access to the forms to the news editors. Export the news to RSS files on your web server.

This is powerful stuff, and, as I said, an instant API, meaning that it can meet all sorts of data management needs, and even act as an intermediary between incompatible systems. I still have some reservations about Rails as a full-fledged application-development environment, mostly because its performance is slow, and, while the keynote mentioned some things that will address speed in 2.0, notably a smart method of combining and compressing CSS and Javascript code, I didn’t hear anything that dramatically addresses that problem. But, as a platform, it’s great to see how it makes actively including data management standards a native output of any project, as opposed to something that the developer must decide whether or not to do. And, as a tool, it might have a real home as a mediator in our data integration disputes.

Seven Questions For Peter Campbell On Open APIs

This interview was conducted by Holly Ross and first published on the NTEN Blog in July of 2006.

What’s an Application Programming Interface (API)?

APIs are the code in any application that allows for the customization and migration of information in and out of the program’s data store. The API allows your application to interface with other systems in the same manner that a door or data line allows your home to interact with the world around it. APIs were originally developed in the telecom industry, as the need to have computer applications that integrated with telephone systems arose. The concept quickly expanded as a method for companies to merge information in their major systems, such as Finance, Human Resources, and Constituent Relationship Management (CRM). Common examples of APIs include importing and exporting data in and out of donor databases, and merging data from multiple sources via the Web, such as gas prices overlaid on Google Maps. Sites that use Google or Yahoo!’s API to merge data are commonly called “web mashups.”

Why would a nonprofit use an Open API?

If you want to do a mailing and your constituents’ addresses lie in multiple systems (donor database, Outlook, Excel, Access), then an API could be used to quickly merge them into one address list. As grant reporting requirements become more stringent, funders want to know what percentage of labor goes into direct service versus overhead, what portion of supply expense is put directly to mission-related use, and what percentage of volunteer time was put to field versus office work. Generating this report requires integrating data from multiple data sources. An API can help automate that task.

What is an Open API, as opposed to a closed one?

An Open API is one that does not restrict you. It gives you full access to your data and the application interface to support your customized needs. A closed API restricts your ability to work with your data. The difference between open and closed APIs is one of degrees, not either/or. The less an application allows you to do with your data, the more closed it is.

Are Open APIs features of Open Source Applications?

Not necessarily. Open Source software comes with code and a license to modify it. An API is an intermediary set of rules that allows you to do customization and integration, even if the source code is closed, as with most commercial applications.

Why are Open APIs controversial within the software industry?

Customers often want software that will not lock them out as organizational needs grow and change. Software selection has to be tied to strategic planning, and products need to be adaptable to unforeseen needs. This doesn’t rule out purchasing Microsoft or other (relatively) closed systems, as there can be strategic and economic advantages in standardizing on a vendor. But you need to do so in full awareness of how that software platform will limit your integration and reporting. There are commercial products, such as Salesforce.com, that have wide open APIs, because Salesforce operates on the philosophy that they should not restrict their customers from using their own data as they choose.

How does a nonprofit use APIs if it doesn’t have technical staff with API skills?

There are many programming resources in the nonprofit technology community who will develop low-cost or free applications that work with the API (again, see Salesforce as an example – the AppExchange is a rich collection of free, low-cost add-ons, with many targeted at our community). Look at the work being done with CivicSpace/CiviCRM to support APIs and integration. Even if the actual use of the API is not an in-house function, the existence of an API for a product or web service is still critical.

Should nonprofits advocate for the availability of Open APIs among software firms that serve the nonprofit sector?

As funders and constituents demand more accountability from nonprofits, and as nonprofits want to better operate our businesses, it’s more important than ever to have commercial applications that are open to integration with APIs. In the nonprofit community, standardizing on one platform and developing it (the Enterprise Resource Planning (ERP) approach) is often far too ambitious – we lack the funding and resources to make that huge an IT investment. So key to our ability to operate in the information age is our ability to integrate data between our various applications. It’s important that nonprofit leaders encourage software firms that serve the nonprofit sector to make Open APIs available.

Peter Campbell is the Director of Information Technology at Goodwill Industries of San Francisco.