techcafeteria

Techcafeteria Blog

Regular (Expression) Magic

Let’s get a bit geeky. Many Idealware visitors come here for advice on purchasing and deploying data management systems, such as donor databases, constituent relationship management systems and content management systems. More often than not, they are replacing older systems with new ones, meaning that one of the trickiest tasks is data migration. If any of this work has ever fallen to you, then you might have found yourself doing tedious editing and corrections in Excel, poring over data screens or rows in Access trying to formalize non-formalized data entry, and generally settling for some lost or incorrect data moving from old system to new.

Wouldn’t it be great to have a magic wand that can instantly reformat the data to the proper format? Well, I have one for you. But, just as Harry Potter had to go to school before he could effectively wave his wand, mine comes with a lesson or two as well.

The wand in question is a search/replace language called regular expressions. Regular expressions are a set of terms that can be used, in supported software, to perform advanced search and replace functions. They were originally popularized by the Unix stream editor (sed), but are now commonly found in text editors, word processors, scripting languages (such as PHP) and other software, usually as an advanced option.

The reason to use them instead of a regular search and replace function is simple: they can search for things that regular search tools can’t. For example:

  • the first three characters at the beginning of each line

  • the three at the end of each line

  • one or more spaces

Regular expressions can also do multiple replacements in one phrase, allowing you to either remove the first comma encountered in a sentence, or all commas. Here are the basics:

A regular expression takes the form of /Search Phrase/Replacement/. A simple search to replace all instances of the word “fish” with the word “bird” would look like:

/fish/bird/

But regular expressions only prove their worth when you learn their special characters:

. (any character)

* (any number of the preceding character, including none)

^ (the beginning of a line)

$ (the end of a line)

() (parentheses surrounding characters in the search phrase can be recalled in the replacement)

$1, $2 (substitute in the replacement for characters saved by parentheses in the search phrase)

\ (backslashes treat the next character literally, even if it’s a regular expression special character)

[a-z], [0-9], [A-Za-z] (groupings search for all of the characters specified between the brackets, using dashes to identify ranges)

Examples:

If you have a text printout of a document that you want to whittle into something more useful, like a CSV file, step one might be to remove any dead space.

/  */ /

will search for one or more spaces (the search phrase is a space followed by a space and an asterisk, and the asterisk means “any number of the preceding character”) and replace them with one space.

/^$/d

will remove all blank lines (lines with nothing between the beginning and the end of the line)
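If you would rather script these two cleanups than run them in an editor, here is a minimal sketch in Python using its standard re module. The function names and the sample text are made up for illustration, and blank lines are dropped with a plain filter rather than a sed-style d command.

```python
import re

def collapse_spaces(text):
    # Same search phrase as above: a space followed by any number of
    # additional spaces, replaced by a single space.
    return re.sub(r"  *", " ", text)

def drop_blank_lines(text):
    # Keep only lines with something between their beginning and end.
    lines = text.split("\n")
    return "\n".join(line for line in lines if not re.match(r"^$", line))

sample = "Name     Title\n\nJoe   Namath\n"
cleaned = drop_blank_lines(collapse_spaces(sample))
print(cleaned)
```

Note that this sketch also drops the empty string left after the final newline, so the result has no trailing line break.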

If you are moving data from one system to another, you might have to reformat dates for the new system. Say the old system exports dates as MM/DD/YYYY and the SQL database you’re importing them to expects YYYY-MM-DD. This Regular Expression will convert all dates to the new format:

/([01][0-9])\/([0-3][0-9])\/([12][0-9][0-9][0-9])/$3-$1-$2/

Let’s break this down:

/ – a slash starts the search phrase section.

( – parentheses surround things that we want to remember, so this starts a section we’ll remember.

[01][0-9] – a month (MM) will be a number between 1 and 12, so, if your system is exporting dates with leading zeros (if not, you can do this with a series of regular expressions to get around that), then the [01] set will match either a leading zero or a one. The [0-9] set will match any digit following that one or zero.

) – this will be remembered in the replacement as $1, because it’s the first thing we remembered.

\/ – since the slash is a regular expression special character (the delimiter), we precede it with a backslash, telling the parser to treat it as a slash, not a delimiter.

([0-3][0-9]) – this will find any pair of digits between 00 and 39, which we know as the day, and remember it as $2, because it’s enclosed in parentheses.

\/ – next slash, again preceded by a backslash

([12][0-9][0-9][0-9]) – this catches the year. You see how, right? It is specifying that the year will be in this millennium or the last by limiting the first character to one or two. We use parentheses to remember this as well.

/ – this slash signifies that the search phrase is done, and the replacement will follow.

$3-$1-$2 – this takes our three remembered phrases and reorders them from month, day, year to year ($3), month ($1), day ($2), placing dashes in-between them.

/ – finally, we close the command with a slash.
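The same date conversion can be sketched in Python, assuming its standard re module. Two things differ from the editor syntax walked through above: Python does not wrap the pattern in slash delimiters, so the slashes inside it need no backslash, and remembered groups are written \1, \2, \3 in the replacement rather than $1, $2, $3. The function name and sample sentence are hypothetical.

```python
import re

# Month, day and year are each remembered in their own parentheses.
DATE = re.compile(r"([01][0-9])/([0-3][0-9])/([12][0-9][0-9][0-9])")

def to_iso(text):
    # Reorder month/day/year into year-month-day.
    return DATE.sub(r"\3-\1-\2", text)

print(to_iso("Gift received 04/27/2009, pledged 12/01/2008"))
```

Like the original expression, this only checks the shape of the date: something like 19/39/2009 would be reordered just as happily, so it is worth eyeballing the source data first.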

One of my standard uses is to take a list – which could be an Excel spreadsheet, or a database dump, or a Word table—clean it up, and then format it into SQL statements that can then be pulled into a database. Most databases can import in CSV files, but Excel, while good at doing some reformatting, can’t do the fancy cleanup tasks that my regular expression-enabled editor can. Once my specific clean-up chores are done, if I’m left with a tab-delimited file, I can do the following three simple searches to turn it into a SQL input file that can just be run in my SQL interpreter.

/\t/’,’/ – searches for all tabs (\t is a symbol that means “tab”) and replaces them with ‘,’

/(.)$/$1’);/ – searches for the last character in a line and replaces it with that character followed by a close quote, close parenthesis and semi-colon.

/^(.)/insert into players (name, title, company) values (‘$1/ – searches for the first character in any line and prepends the front end of the SQL statement.

If we had an input file with lines like this (where [tab] stands for a tab character):

Joe Namath[tab]Quarterback[tab]Forty-niners

It would become

insert into players (name, title, company) values (‘Joe Namath’,’Quarterback’,’Forty-niners’);
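Here is a rough sketch of the same three searches chained together in Python, assuming its standard re module and plain straight quotes, which is what a SQL interpreter expects. The function name is made up; the table and column names come from the example above. The MULTILINE flag makes ^ and $ match at every line, not just at the start and end of the whole text.

```python
import re

def to_insert_statements(tab_delimited):
    # 1. Every tab becomes ',' (close quote, comma, open quote).
    text = re.sub(r"\t", "','", tab_delimited)
    # 2. The last character of each line is followed by ');
    text = re.sub(r"(.)$", r"\1');", text, flags=re.MULTILINE)
    # 3. The first character of each line is preceded by the SQL front end.
    text = re.sub(
        r"^(.)",
        r"insert into players (name, title, company) values ('\1",
        text,
        flags=re.MULTILINE,
    )
    return text

rows = "Joe Namath\tQuarterback\tForty-niners\nJerry Rice\tReceiver\tForty-niners"
print(to_insert_statements(rows))
```

A value containing an apostrophe would break the quoting and needs to be escaped first; neither the three searches nor this sketch handles that.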

There are plenty of excellent resources for learning about regular expressions on the web, but many of them are targeted at programmers, making them a bit thick to read through. For a friendlier introduction, I recommend the regular-expressions.info quickstart. While many text-processing tools, including Microsoft Word, support regular expression search and replace, I recommend using a good text editor over a word processor, because it will likely include supporting functionality, such as block copying/pasting, and will handle very large files with far more speed and grace. I’ve been happy using TextPad and EditPlus on Windows, and TextMate and TextWrangler on the Mac. Wikipedia publishes an incomplete list of applications that include regular expression functionality.


NPTech.Info Updated

NPTech Aggregator at http://nptech.info

Those of you familiar with my side project at http://nptech.info know that it has been reliably aggregating blog entries, photos and websites tagged with the term “nptech” for close to four years now. It’s been a little neglected of late, but after Annaliese over at NTEN gave it a shout-out, I figured it was due for some clean-up. Here’s what’s new:

  • About 25 blogs added to the NPTech Blogs section, and a broken link or two corrected on the existing ones;

  • Information from Twitter added to the main “Tagged items” feed that already grabs nptech items from Delicious, Flickr and Technorati;

  • New additions to the general tech section from sites like ReadWriteWeb and Mashable

  • A simple Facelift, primarily adding a little color and going for a more attractive font (fancy design is not a big priority here, particularly since my last big effort to pretty it up got creamed in a Drupal upgrade).

As usual, if you have a blog focused on Non-Profit Technology that you’d like added to the mix, let me know, but rest assured that, if you can find your blog on Technorati, we’re already grabbing the items that you tag or categorize as “nptech”.


NTC (Just) Past and Future

Photo by Andrew J. Cohen of Forum1

Here it is Saturday, and I’m still reeling from the awesome event that was the Nonprofit Technology Conference, put on by org of awesomeness NTEN. First things first, if you attended, live or virtually, and, like me, you not only appreciate, but are pretty much astounded by the way Holly, Anna, Annaliese, Brett and crew get this amazing event together and remain 100% approachable and sociable while they’re keeping the thing running, then you should show your support here.

We had 1400 people at the sold-out event, and I’m pretty sure that at least 200 more were turned away once it hit capacity. What does that say about this conference in a year when almost all of us have slashed this type of budget in response to a dire economic situation? I think it says that NTEN is an organization that gets, totally, and phenomenally, what the web means to cash-strapped, mission-focused organizations, and, while we have all cut spending, sometimes with the painful sacrifice of treasured people and programs, we know that mastering the web is a sound strategic investment.

Accordingly, social media permeated the event, from the Clay Shirky plenary, to the giant screen of tweets on the wall, and the 80% penetration of social media as a topic in the sessions. As usual, I lit a candle for the vast majority of nonprofit techies who are not on Twitter, don’t have an organizational Facebook page, and, instead, spend their days troubleshooting Windows glitches and installing routers. My Monday morning session, presented with guru Matt Eshleman of CITIDC, was on Server Virtualization. If you missed it, @jackaponte did such a complete, accurate transcription that you can feel like you were there just by reading her notes (scroll down to 10:12) and following along with the slides.

My dream—which I will do my best to make reality—is that next year will include a Geek Track that focuses much harder on the traditional technology support that so many NPTechs need. I stand on record that I’m willing to put this track together and make it great!

I was also quite pleased to do a session on How to Decide, Planning and Prioritizing, based on my chapter of NTEN’s book, Managing Technology to Meet Your Mission.  It was really great to start the session with a question that I’ve always dreamed I’d be able to ask: “Have you read my book?”.  I’m in debt to NTEN for that opportunity!

The biggest omission at this event (um, besides reliable wifi, but what can you do?) was a space for Twitter names on our ID badges. Twitter provided a number of things to the half of the attendees, by my estimation, who hang out there:

  • Event anticipation buildup, resource sharing, session coordination and  planning, ride and room sharing and other activities were all rife on Twitter as the conference approached.

  • Session tweeting allowed people both in other sessions and at home to participate and share in some of the great knowledge shared.

  • For me, as a Twitter user who has been on the network for two years and is primarily connected to NTEN members, Twitter did something phenomenal. Catching up with many of my “tweeps”, we just skipped the formalities and dived into the conversations. So much ice is broken when you know who works where, what they focus on in their job, if they have partners and/or kids, what music tastes you share, that catching up in person means diving in deeper. The end result is clear—#09ntc is still an active tag on Twitter, and the conference continues there, and will continue until it quietly evolves into #10ntc.

One thing, however, worries me. This was the tenth NTC, my fifth, but it was the first NTC that the online world noticed. Tuesday, on Twitter, we were the second most popular trend (the competing pandemic outranked us). NTEN’s mission is to help nonprofits use technologies to further their missions. But, as said above, this conference was, in many ways, a social media event. I’m hoping that Holly and crew will review their registration process next year to ensure that early spots in what is sure to be an even more popular event aren’t filled up by people who really aren’t as committed to changing the world as they are to keeping up with this trend.

But, concerns aside, we need to send that team to a week-long spa retreat, and be proud of them, and proud of ourselves for not only being a community that cares, but being one that shares. I urge even the most skeptical of you to jump on the Twitter bandwagon; we’re not on there discussing what we had for breakfast. We’re taking the annual event and making it a perpetual one, with the same expertise sharing, querying, peer support and genuine camaraderie that makes the nptech community so unique – and great. Come join us!


More RSS Tools: Sharing Feeds

For my last followup to my RSS article, Using RSS Tools to Feed your Information Needs, I want to discuss OPML, the standard for RSS Reader feed information, and talk a bit about why RSS, which is already quite useful, is about to become an even bigger deal. Last week, I discussed sharing research with Google Reader; before that, filtering RSS feeds with Yahoo! Pipes, and I started with a post about integrating content with websites.


Admitting that I might represent an extreme case, I subscribe to 96 feeds in Google Reader. I started with Google Reader last December – prior to that, I used a Mac RSS Reader called Vienna. Moving from Vienna to Google Reader might have been a chore, but it wasn’t, thanks to Outline Processor Markup Language (OPML). The short story on OPML is that it was developed as a standard format for outlining. While it is used in that capacity, it’s more commonly used as a format for collecting a list of RSS feeds, with last read pointers, that can then be processed by other feed-reading software. So, I exported all of my feeds from Vienna to a .opml file, then I imported that into Google Reader, and all of my feeds were instantly set up. If you run a Wordpress blog, you can rapidly build your blogroll by importing an .opml file.


In addition to sharing feed information with applications, OPML can be used to share a group of feeds with a co-worker, friend or constituent. Say your org does advocacy on a particular issue, and you’ve collected a set of feeds that represent the best news and commentary on your issue. You could make the OPML file available on your web site for your constituents to incorporate in their readers.
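To make the format less mysterious, here is a minimal sketch in Python, using only the standard library, that reads an OPML subscription list and pulls out the feeds. In such a list, each feed is an outline element whose xmlUrl attribute holds the feed address; the sample file, its feed names and the example.org addresses are all invented for illustration.

```python
import xml.etree.ElementTree as ET

# A tiny, made-up OPML list with one top-level feed and one inside a folder.
SAMPLE_OPML = """<opml version="1.0">
  <head><title>Bill 221b reading list</title></head>
  <body>
    <outline text="Policy Blog" type="rss" xmlUrl="http://example.org/policy/feed"/>
    <outline text="Advocacy">
      <outline text="Coalition News" type="rss" xmlUrl="http://example.org/coalition/rss"/>
    </outline>
  </body>
</opml>"""

def list_feeds(opml_text):
    root = ET.fromstring(opml_text)
    feeds = []
    # iter() walks nested outlines too, so feeds inside folders are found.
    for outline in root.iter("outline"):
        url = outline.get("xmlUrl")
        if url:  # folders have no xmlUrl and are skipped
            feeds.append((outline.get("text"), url))
    return feeds

for name, url in list_feeds(SAMPLE_OPML):
    print(name, url)
```

A file like this is all a constituent would need to import your whole reading list into a reader in one step.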


At this point, you might be saying to yourself, “what are the odds that my constituents even know what a feed reader is? Wouldn’t making this available be more likely to confuse than help people?” As good a question as that is, here’s why I think that you won’t be asking it soon. RSS has seen quick and steady adoption as a standard web service. Four years ago, it was obscure; today every content management system and web portal supports it. It features prominently in the strategic plans of tech giants like Google, Microsoft and Yahoo!. But it’s not as well-known by the general computing public; RSS has yet to become a real household concept, like search and email have. The game-changer is underway, though. Last month, The Seattle Post-Intelligencer, one of Seattle’s primary daily papers, ceased print publication. The San Francisco Chronicle announced last month that they are making one last-ditch effort, with a redesign and new printing presses, to stem the growing budget deficit that they face. Competition from TV and the web is driving newspapers out of business, and the hope that something will reverse this trend is thin.


As the internet becomes the primary source of news and opinion, RSS is a natural fit as the delivery medium. You can see that all of the Seattle PI sections are available as RSS feeds, and they have an option to customize the news and features that you see on your homepage. How long before they offer your customized paper as an OPML file, allowing you to instantly replicate your web experience in a reader?


In 1995, internet email was an arcane, technical concept. I figured out that I could send mail to an Internet address using my company’s MCI Mail account. My email address was 75 characters long. RSS may seem similarly oblique today, but it’s well on the road to being a mainstream method of internet information delivery. Your partners and constituents won’t just appreciate your support for it; they’ll start to expect it. I hope that my article and these follow-ups in the blog can serve as a good starting point for understanding what RSS can do and what you might do with it.


More RSS Tools: Using Google Reader for Research and Sharing

Google Reader gets a good mention in my RSS article, Using RSS Tools to Feed your Information Needs, but deserves an even deeper dive. This is a follow-up to that article, along with my recent posts on Integrating content with websites, and Managing Content with Pipes. We’ve established that an RSS Reader helps you manage internet information far more efficiently than a web browser can; and we’ve talked in the last few posts about publishing feeds to your web site. This post focuses on using tools like Google Reader to share research.

Out of the box, GReader (as it’s affectionately known) is a powerful, web-based reader that lets you subscribe, mark and share items in two significant ways. Shared Items are items that get published to a public page that you can point your friends and co-workers to. Further, this page can be subscribed to via RSS as well, so it can be republished to your web site, or integrated into a Facebook feed. Using (fake) bill 221b as an example, if you monitor for and selectively share articles related to the bill, you can easily point co-workers and constituents to your shared page, and/or republish those items in places where your audience will see them.

Shared Items are also made available to other GReader users who choose to share with you. This offers a greater level of convenience for teams working with shared research; it can also afford a level of confidentiality if you don’t want to publicize a public page. Not only can you share the items you find; you can also tag them, much like you would with Delicious or Flickr, and add a note, if you have thoughts or context-setting notes to share. A function recently added to GReader takes this even further – shared items can be commented on, much as a blog post can.

The last bit to add to this arsenal is a very powerful, but not terribly obvious GReader feature. The Note in GReader bookmarklet (which you can drag to your web browser’s quick links or bookmarks toolbar from the GReader “Notes” page) lets you share, with comments and tags, pages that you find on the web as GReader shared items. So if you run across something that isn’t in your feeds (and there’s plenty of web content that can’t be subscribed to), this lets you add it to your shared items.

What I’ve found is that, as much as I admire social bookmarking sites like Delicious, they become a lot less useful when I can store all of the pages that I find via RSS or browsing, with tags and an option to share them, in the same convenient place.

It’s important to note that, as powerful as all of this is, it still lacks some functionality that similar tools have. One great advantage of using Delicious as a link-sharing tool is that you can share links specific to any tag (or set of tags). Google Reader doesn’t offer multiple shared pages based on filtering criteria. And while you can add notes to your feed (without adding links), it’s not as flexible a repository as a tool like Evernote, which lets you save web pages, PDFs and all sorts of documents to a single web-based folder.

Also, Google Reader isn’t the only game in town. The Newsgator family of RSS readers offers similar sharing functions, some of which overcome the limitations above, as do other readers out there (please share your favorite in the comments).

What it boils down to, though, is that we now have powerful, integrated options for online research, as individuals, as teams, and as information agents for our constituents. The convenience of publishing as you discover is a significant advancement over earlier schemes, which usually involved either sending a lot of easily-lost links by email, or submitting your finds to a webmaster, who would then add them to a page on your site. This is a publish-as-you-find approach that incorporates sharing and communication into the research process.

Next week, I’ll finish up the “More RSS Tools” series with a post about OPML, the way that you make your collection of feeds portable.


More RSS Tools: Managing Content with Pipes

I’m continuing with follow-up topics from my RSS article, Using RSS Tools to Feed your Information Needs. Last week, I discussed integrating content with websites, and this week I’m going to dive into one of the more advanced ways to work with RSS content. This gets a little geeky, but it really shows off some of the sophistication of this technology.

The article provides numerous examples of RSS sources, but all in the form of web sites, blogs and web services that offer you one or more streams of information. If you want to narrow your view beyond the feeds available on a site, say, because you are only interested in Idealware posts about CRM tools or the ones written by Steve Backman, then you need a tool that will refine your search. Alternatively, you might want to put a section containing news stories relevant to a particular issue on your site, but want some control over the sources, as well as the subject matter. For this amount of control over the content you retrieve, you want to use something like Yahoo! Pipes.

Pipes is an RSS mashup editor. It’s a tool that looks a bit like Microsoft’s Visio, where you drag boxes onto a grid and draw relationships between them. But it’s not a layout or flowcharting tool; instead, it’s a visual mapping and filtering tool that lets you identify sources and then apply rules to those sources before merging them into an aggregated feed. To break that down, let’s say that your goal is to either monitor talk about a bill, or, maybe, to publish a section on your web site titled “What they’re saying about bill 221b” (I made that bill up). You have identified eight blogs that have good posts on the subject, and these are blogs that you trust to properly represent the issues and not, in any way, malign or confuse your efforts.

In Pipes, you can select all eight as sources, and then set up a filter to block any posts that don’t reference “221b”. The resulting RSS feed, which you can then subscribe to or republish, will isolate the posts that are relevant to the bill from your selected sources.

For example, here’s the pipe that will allow you to skip the posts by Michelle, Heather, Paul, Laura, Eric and me and just see Steve’s:

[Screenshot of the pipe: Picture 2.png]

Another, more advanced example: You have an organizational Twitter feed that you want to republish to your site, but you only want to publish your posts, not your individual replies. In Twitter, a reply is always identifiable by the very first character, which will be an “@” sign. Twitter RSS items arrive in the format “yourtwitterid: tweet”, so any reply will start with “yourtwitterid: @”. Setting up a Yahoo! Pipes filter to block any result with “: @” in the text will isolate your posts from the replies. You can add a “Regex” (i.e., search/replace) command to replace “yourtwitterid:” with nothing in order to publish just the tweet. The pipe will look like this:

[Screenshot of the pipe: Picture 1.png]
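For anyone who thinks better in code than in boxes and wires, the logic of that Twitter pipe can be sketched in a few lines of Python. This is not how Pipes works internally; it simply mimics the filter and the search/replace step on a list of item titles. The function name, the “myorg” Twitter ID and the sample tweets are all made up, and the filter here is slightly stricter than the one described, blocking only items that start with the ID followed by “: @”.

```python
import re

def posts_only(items, twitter_id):
    # Search/replace step: strip the leading "yourtwitterid: " from each item.
    prefix = re.compile(r"^" + re.escape(twitter_id) + r": ")
    reply_start = twitter_id + ": @"
    kept = []
    for item in items:
        # Filter step: skip replies, which start with "yourtwitterid: @".
        if item.startswith(reply_start):
            continue
        kept.append(prefix.sub("", item))
    return kept

feed = [
    "myorg: New report is out",
    "myorg: @donor thanks!",
    "myorg: Event tonight",
]
print(posts_only(feed, "myorg"))
```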

If you play with Pipes (Yahoo! ID required, otherwise free), I highly recommend starting with an example like mine or this one by Gina Trapani to get the feel of it. Save your pipe, and you can subscribe to it—it updates automatically, and you don’t have to make it public for it to work.

Google has its competing Google Mashups tool in private beta, and similar tools are popping up all over the web. I talk a lot about how RSS is the technology that allows us to manage the information on the web. Pipes lets us refine it. It’s great stuff.

Look for more RSS talk on OPML files and Google Reader in my upcoming posts.


Feed Fight

LinkedIn has Facebook envy, and Facebook has Twitter envy. Ignoring MySpace (my general recommendation), these are three big social networks that, sadly, seem to be trying to co-opt each other’s strengths rather than differentiate themselves. Per ReadWriteWeb, LinkedIn is jealous of Facebook’s page views, and is looking for ways (like applications) to keep users connected to the web site. More noticeably, Facebook’s recent failed attempt to buy Twitter was followed up by a redesign that makes Facebook much more like Twitter. All of this inter-related activity has created some confusion as to what one should or shouldn’t do where, and a question as to whether co-opting your neighbors’ features is a sound strategy.

My take is that each of these networks serves a different purpose, and, while I am connected to a lot of the same people on all three, they each have distinct audiences and the communication I do on these networks is targeted to the individual networks.

  • LinkedIn is a business network. This is a place where potential employers and business associates are likely to go to learn about me.  Accordingly, I sparingly use the status update feature there, and never post about what movie I took the kid to or how funny the latest XKCD strip was.

  • Facebook is a casual network where I have some control over who sees my posts; it’s also the place where I find the most old friends and family. So, given that my potential employers and business associates aren’t likely to see my profile unless they have a personal or more collegial relationship already established with me, this is where I’ll give a status review of the Watchmen movie or post a picture of the kid.

  • For me, Twitter is the business casual network, where my nptech peers gather to support each other and shmooze.  I am mindful that my tweets paint a public picture, so I keep the ratio of professional to personal tweets high and I don’t say things that I wouldn’t want my wife or boss to see on the web.

The multiple, overlapping networks create some issues in terms of effective messaging.  One is the echo chamber effect – it’s ridiculously easy to automatically feed your tweets to Facebook and LinkedIn.  The other is the lack of ability to do more than broadly address numerous audiences.  I mean, my Facebook friends include co-workers, business associates, childhood friends and Mom; you’re probably in a similar boat.  For some people, this creates the “I really didn’t want Mom to hear about the party I attended last night” issue.  For most of us, it simply means that we don’t want to bore our old friends and family with our professional blogging and insights, any more than we really want our co-workers to see what sort of hippies we were when we were 17.

So I manage some of this by using Tweetdeck as my primary Twitter client, because the latest version lets me, optionally, send a status update to Facebook as well as Twitter, which I do no more than once a day with something that should be meaningful to both audiences.  What I won’t do (as many of my Facebook/Twitter friends do) is publish all of my tweets to Facebook—that’s cruel to both the friends who don’t need to see everything you tweet and the ones who are already seeing what you tweet on Twitter.

At first, I thought the idea of Facebook incorporating Twitter might be a good one.  Facebook has a big advantage over Twitter.  It’s hard to be new to Twitter; the usefulness and appeal are pretty muted until you have a community that you communicate with.  Facebook starts with the community, so it solves that problem.  But, for me, the amount of control I have over the distribution has a lot to do with the messaging, and I like that Twitter is completely public, republishable, and Google-searchable.  I communicate (appropriately) in that medium; and if you aren’t interested in what I want to communicate, I’m really easy to drop or ignore.  But my Mom is probably far less interested in both non-profit management and Technology than my Twitter followers, and I don’t want her to unfriend me on Facebook.  So I’d rather let Facebook be Facebook and let Twitter be Twitter.  Just because an occasional beer hits the spot, as does an occasional glass of wine, that doesn’t mean that I want to mix them together.


More RSS Tools: Web Site Integration

Those of you who visit pages besides the blog here at Idealware may have noticed that my article Using RSS Tools to Feed your Information Needs is up. If you’re new to Really Simple Syndication, my hope is that my guide will help you become more efficient and effective in your use of the web. If you’re an old hand at RSS, then I’m hoping the article will serve as a good tool when trying to impress upon others the value of syndication.

RSS is a big topic, and writing the article was, in one respect, a challenge: in order to write a solid, intermediate guide to RSS use, I had to narrow the scope a bit. My initial interest and eventual obsession with RSS was sparked by two things: The overall usefulness of a tool that brings the web info I’m interested in to me; and the possibilities of using RSS as a publishing platform. So the article covers the first use well, but omits many cool things, like RSS Pipes, OPML, web site integration, and aggregators/portals. I hope to take these on over the next few weeks here in the blog.

Let’s start with web site integration. If you manage a web site, then you know that the name of the game is fresh content. While RSS will not eliminate the need to actively maintain your site, it can supplement your content in an automatically refreshing stream, as well as serve as a publishing medium.

If your site is built with a content management system (CMS), then you are probably already most of the way there. Most CMSs have built-in RSS aggregators that allow you to select the relevant content and publish it to a section of your site. If it isn’t a standard feature of your CMS, then browse the catalog of add-ons and extensions and you’ll probably find it there. Of course, if you use a commercial CMS, as opposed to an open source product, you might have to pay more for the add-on.

If you don’t have a CMS, a minimal amount of PHP scripting expertise can accomplish the same thing by using pre-built RSS function libraries like Magpie RSS. Magpie is a set of PHP routines that you copy to your web server, allowing you to write minimal, simple code that identifies the feed and publishes it to a page. The heavy lifting is done by Magpie; all you do is reference the feed and format the appearance of the items.
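To show how little code the “reference the feed and format the items” step involves, here is a rough sketch of the same idea in Python, using only its standard library rather than Magpie, so as not to guess at Magpie’s own PHP function names. It assumes the feed has already been fetched as RSS 2.0 text; the sample feed, its titles and the example.org links are invented.

```python
import html
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 feed standing in for one fetched from the web.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <item><title>Board adopts new budget</title><link>http://example.org/budget</link></item>
    <item><title>Volunteers &amp; donors honored</title><link>http://example.org/honors</link></item>
  </channel>
</rss>"""

def feed_to_html(rss_text, limit=5):
    channel = ET.fromstring(rss_text).find("channel")
    lines = ["<ul>"]
    for item in channel.findall("item")[:limit]:
        # Escape the text so feed content cannot break the page markup.
        title = html.escape(item.findtext("title", default=""))
        link = html.escape(item.findtext("link", default=""))
        lines.append('  <li><a href="{}">{}</a></li>'.format(link, title))
    lines.append("</ul>")
    return "\n".join(lines)

print(feed_to_html(SAMPLE_RSS))
```

The resulting list of links is what you would drop into the section of the page reserved for the feed.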

The simplest use is in republishing content on the web that’s pertinent to your site. You can aggregate news relevant to your cause, or sample topics of related interest from blogs on the web. For an example, look at the nonprofit technology news aggregator that I set up at nptech.info. This uses Drupal’s built-in RSS aggregator to create a three-section web site republishing nptech blogs, items tagged “nptech” on the web, and general technology news.

But it doesn’t stop there: if you post open positions on Craigslist, you can eliminate the need to also update your web page by simply subscribing to a search for your open jobs. The strategy here is in using RSS not only to add content, but to maintain content that currently requires a Webmaster’s attention. If you post your events to a site like Upcoming.org, your events page can be a simple RSS feed. If you link to related sites and associates, you can automate that as well by setting up an account at a bookmarking site, such as Delicious, tagging sites that you want to be linked to your web site with a unique tag, and then subscribing to that tag. And this concept works just as well for graphical content at Flickr, or videos at YouTube.

I’ll be posting soon about additional ways to manage RSS feeds, and I want to take a deeper dive into Google Reader, which takes it all to another level. In the meantime, if you have great stories about integrating RSS feeds with your web site, please share in the comments.


Now that Mom’s on Facebook…

...here’s what I want to write on her wall:

Dear Mom, welcome to Facebook!  I’m glad you’re here, because we don’t talk enough, and this is an opportunity to be a little more present in each other’s lives.  Mind you, it won’t, and shouldn’t, replace any phone calls or visits.

Facebook is a bit like taking the big, wide Internet and narrowing it down to just the stuff that your friends would show you.  It’s nice because we get to catch up with a lot of old and new friends in one place, but that same convenience also makes it a bit superficial.  Since almost everything you say on Facebook is shared with all of your friends, you’ll be saying things that you don’t mind everyone hearing.  That puts a bit of a filter on some of the meaningful exchanges that are so much a part of our true friendships.

Another big thing about Facebook is that it is the product of a private company; not a big, amorphous set of connections like the Internet at large.  And, since it’s “free”, the business model is advertising.  So Facebook is a business that makes money off of your interests and relationships. If that doesn’t sound just a little bit scary to you, I think it should.

So here are some great things to do and some things to avoid on Facebook:

  • Connect with people you know (ignore requests from people that you’ve never met!)
  • Share links to useful information, but stop short of sharing stuff that says more about your personal interests than you would want the world to know.
  • Ignore most of the applications.  Our friends and family are, in general, serious and active people who don’t have time to speculate on which of their Facebook friends they would like to be trapped on a desert island with.  I routinely ignore all of the non-existent gifts and requests to do things that I really don’t have any time to do, and, fortunately, my friends take the hint and stop bothering me with them.
  • Keep in mind that, every time you include a friend in an application invite, you’re telling the company that made the application about them.  So it’s not just that so many of these things are insanely trivial; they’re also potentially nefarious.
  • Don’t go crazy joining groups.  Every time you join a group, you open your profile to all of the members of that group.  It’s better to try and contain your exposure to people that you are fairly certain you would want to know.
  • Finally, you have my email address – send me personal mail there, not via Facebook’s mail.  While the mail is useful for establishing communication with people you reconnect with, and the wall writing is fun because you share it with others and can start conversations, I much prefer keeping our personal communication in my regular email.

To my mind, Facebook is a fun place to catch up with old friends and share things with my community, but if I only know someone on Facebook, let’s face it, they’re not really a friend.  Friendship implies a level of intimacy that shouldn’t be subject to broad peer review and data mining for advertisers.  And Facebook should not be a place that you can’t neglect for a week or more without risking offending someone.  Used moderately, with moderate expectations on the part of you and your Facebook friends, it has its rewards.

The world is coming to Facebook – it’s not just my Mom; it’s also my Dad, sister, brother-in-law, co-workers, grade school friends, and an assortment of people from everywhere in my life.  What do you want to say to the people you’re connecting with?  Leave a comment!


Both Sides Now

Say you sign up for some great Web 2.0 service that allows you to bookmark web sites, annotate them, categorize them and share them. And, over a period of two or three years, you amass about 1500 links on the site with great details, cross-referencing—about a thesis paper’s worth of work. Then, one day, you log on to find the web site unavailable. News trickles out that they had a server crash. Finally, a painfully honest blog post by the site’s founder makes clear that the server crashed, the data was lost, and there were no backups. So much for your thesis, huh? Is the lesson, then, that the cloud is no place to store your work?

Well, consider this. Say you start up a Web 2.0 business that allows people to bookmark, share, categorize and annotate links on your site. And, over the years, you amass thousands of users, some solid funding, advertising revenue—things are great. Then, one day, the server crashes. You’re a talented programmer and designer, but system administration just wasn’t your strong suit. So you write a painful blog entry, letting your users know the extent of the disaster, and that the lesson you’ve learned is that you should have put your servers in the cloud.

My recent posts have advocated cloud computing, be it using web-based services like Gmail, or looking for infrastructure outsourcers who will provide you with virtualized desktops. And I’ve gotten some healthily skeptical comments, as cloud computing is new, and not without its risks, as made plain by the true story of the Magnolia bookmarking application, which recently went down in flames as described above. The lessons that I walk away with from Magnolia’s experience are:

  • You can run your own servers or outsource them, but you need assurances that they are properly maintained, backed up and supported. Cloud computing can be far more secure and affordable than local servers. But “the cloud”, in this case, should be a company with established technical resources, not some three-person operation in a small office. Don’t be shy about requesting staffing information, resumes, and details about any potential off-site vendor’s infrastructure.
  • You need local backups, no matter where your actual infrastructure lives. If you use Salesforce or Google, export your data nightly to a local data store in a usable format. Salesforce lets you export to Excel; Google supports numerous formats. Gmail now supports an Offline mode that stores your mail on the computer you access it from. If you go with a vendor who provides virtual desktop access (as I recommend here), get regular snapshots of the virtual machines. If this isn’t an over-the-air transfer, make sure that your vendors will provide your data on DVD or another suitable medium.
  • Don’t sign any contract that doesn’t give you full control over how you can access and manipulate your data, again, regardless of where that data resides. A lot of vendors try to protect themselves by adding contract language prohibiting mass updates and user access, even on locally-installed applications. But their need to simplify support should not come at the expense of your complete control over how you use your information.
  • Focus on the data. Don’t bend on these requirements: Your data is fully accessible; It’s robustly backed up; and, in the case of any disaster, it’s recoverable.
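For the local backup habit in particular, here is a minimal sketch of what a nightly job might do once it has the exported records in hand. It assumes the export has already been pulled down as a list of rows; how you get that export out of Salesforce, Google, or any other vendor is not shown, and none of this is a vendor API. The function names, the `export-YYYY-MM-DD.csv` naming scheme, and the sample records are all hypothetical.

```python
import csv
import datetime
import pathlib
import tempfile


def save_snapshot(rows, backup_dir, day):
    """Write the exported rows to a dated CSV file and return its path."""
    if not rows:
        raise ValueError("refusing to write an empty snapshot")
    backup_dir = pathlib.Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    path = backup_dir / "export-{}.csv".format(day.isoformat())
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    return path


def verify_snapshot(path, expected_count):
    """Read the file back to confirm it holds as many rows as were exported."""
    with open(path, newline="", encoding="utf-8") as handle:
        return sum(1 for _ in csv.DictReader(handle)) == expected_count


def prune_snapshots(backup_dir, keep=30):
    """Delete all but the newest `keep` snapshots; return the removed names."""
    # ISO dates in the file names sort chronologically as plain text.
    snapshots = sorted(pathlib.Path(backup_dir).glob("export-*.csv"))
    stale = snapshots[:-keep] if keep > 0 else snapshots
    for old in stale:
        old.unlink()
    return [old.name for old in stale]


if __name__ == "__main__":
    # Hypothetical records standing in for a real vendor export.
    records = [
        {"name": "Ada", "email": "ada@example.org"},
        {"name": "Grace", "email": "grace@example.org"},
    ]
    with tempfile.TemporaryDirectory() as folder:
        saved = save_snapshot(records, folder, datetime.date.today())
        print(saved.name, "verified:", verify_snapshot(saved, len(records)))
        print("pruned:", prune_snapshots(folder, keep=30))
```

Reading the file back right after writing it is the cheap part of "recoverable": a backup that has never been opened is only a hope. CSV is used here simply because it opens in Excel and nearly everything else.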

Technology is a set of tools used to manage your critical information. Where that technology is housed is more of a feature set and financial choice than anything else. The most convenient and affordable place for your data to reside might well be in the cloud, but make sure that it’s the type of cloud that your data won’t fall through.
