Build An Aggregation Site With Drupal (Part 1)

This tutorial is split into three parts. Part 1 (this part) explains how to set up the aggregation and import feeds. Part 2 (to be published next post) explains how to set up cron to update the feeds automatically and covers using views to create some different site sections. Part 3 (to be published the post after that) explains how to theme everything. In the tutorial I will be building a Drupal-based sports news aggregation site, but you can tailor this to whatever type of news items you'd like.

The goals:

  • Create an aggregation site which aggregates RSS feeds and outputs them in river-of-news-style pages with the most recent news items first.
  • Create some different site sections (football and baseball) which only show news items related to that topic.
  • Allow users to filter news items by source (e.g. ESPN, BBC etc.).
  • Create RSS feeds of our aggregated pages which are available for our users.

You can check out the finished aggregation site (part 1 + part 2) here.

The set up:
For this tutorial I'll be using the following:

  • A clean install of Drupal 5.10 (using Garland)
  • SimpleFeed 5.x-2.2
  • Views 5.x-1.6

A quick word on SimpleFeed vs other aggregation modules:
There are a number of other aggregation modules available for Drupal. From my own experience the two best are SimpleFeed and FeedAPI. FeedAPI has excellent functionality and can do some very cool stuff (for example, check out this video on drupaltherapy.com which shows how to use FeedAPI and feed element mapping). However, in this case I've chosen to use SimpleFeed because I don't require any of this extra functionality and SimpleFeed is, well, the simplest to use.

Step 1: Set up the site and modules
Set up your Drupal site and then download and install the SimpleFeed module and the Views module. Select the following options for each module:

Drupal SimpleFeed and Views module options

Step 2: Install the missing simplepie.inc file
For SimpleFeed to work correctly, we need to place the simplepie.inc file from the SimplePie library into our SimpleFeed module directory. If you go to your status report (admin/logs/status) now, you'll see the following error alerting you to this fact:

Drupal SimpleFeed SimplePie missing error message

So, to sort this out do the following:

  • Go to simplepie.org and download the latest version of SimplePie by clicking on the big 'Download' button (at the time of writing this is SimplePie version 1.1.1).
  • Extract the contents of the download, which will create a folder named 'SimplePie 1.1.1'.
  • Open this folder, locate the simplepie.inc file, and copy it into your SimpleFeed module folder. So you should have 'sites/all/modules/simplefeed/simplepie.inc'.

Now when you check your status report page you should not see any errors.
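If you'd rather confirm this from the command line, something like the following would do. It is only a minimal sketch: the function name is my own, and it assumes the module lives in 'sites/all/modules/simplefeed' as described above. It checks that the file exists and nothing more.

```python
from pathlib import Path

# Relative location this tutorial uses for the SimplePie include file.
SIMPLEPIE_RELATIVE_PATH = Path("sites/all/modules/simplefeed/simplepie.inc")


def simplepie_installed(drupal_root):
    """Return True if simplepie.inc exists as a regular file under drupal_root."""
    return (Path(drupal_root) / SIMPLEPIE_RELATIVE_PATH).is_file()


# Hypothetical usage, with the path to your own Drupal root:
#   simplepie_installed("/path/to/drupal")
```

The status report page is still the authoritative check, since it is SimpleFeed itself reporting whether it can load the library.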

Step 3: Set up a vocabulary
In order to theme our news items more effectively, and to help us with filtering and sorting news items, we're going to assign taxonomy terms to them. SimpleFeed includes auto-assign functionality for taxonomy terms which will be very helpful here.

First, go to the 'Add vocabulary' subsection of the 'Categories' admin section (admin/content/taxonomy/add/vocabulary). Then create a vocabulary with the following settings:

  • Vocabulary name: Source
  • Types: check 'Feed' and 'Feed Item'
  • Check 'Free tagging'

Drupal vocabulary settings

Step 4: Configure your SimpleFeed settings
There are a few places where we can configure settings for SimpleFeed:

  • SimpleFeed settings page (admin/settings/simplefeed)
  • Access control 'simplefeed module' settings (admin/user/access)
  • 'Feed' content type settings page (admin/content/types/feed)
  • 'Feed Item' content type settings page (admin/content/types/feed-item)

SimpleFeed settings page
The SimpleFeed settings (admin/settings/simplefeed) are fairly self-explanatory. In this
case we will use the following settings:

Drupal SimpleFeed settings

'Discard feed items older than:'
we always want our feed items to be available to users so we set this to 'Never'.

'Check feeds every:'
1 hour is good here because sports news is published frequently, so we want to check for
updates often.

'Default input format:'
we'll start by leaving this as the default 'Filtered HTML' option, which will filter out
any HTML tags that are not allowed in the input format settings (admin/settings/filters).
However, with RSS feeds you may find that an unclosed tag in a feed item has an
adverse effect on the rest of your page, so you may need to remove further tags from the
allowed list after some testing.

'Vocabulary'
set this to 'Source', which is the vocabulary we set up in the previous step. This will
allow feed items to automatically inherit their parent feed's taxonomy terms.

'Automatically add categories set by external feeds to the vocabulary above'
we will leave this unchecked as we want to tightly control the taxonomy.

'Cron throttle'
the default 50 will be plenty for now!
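To get a feel for how 'Check feeds every:' and 'Cron throttle' interact, here is a rough model of the selection that happens on each cron run. This is not SimpleFeed's code. It assumes that the throttle caps the number of feeds handled per cron run and that a feed is due once the check interval has passed; the function and field names are made up for illustration.

```python
from datetime import datetime, timedelta

CHECK_INTERVAL = timedelta(hours=1)  # models 'Check feeds every: 1 hour'
CRON_THROTTLE = 50                   # models 'Cron throttle: 50'


def feeds_due(feeds, now, interval=CHECK_INTERVAL, throttle=CRON_THROTTLE):
    """Pick the feeds to refresh on one cron run.

    feeds is a list of dicts with 'title' and 'last_checked'
    (a naive datetime, or None if the feed was never checked).
    """
    due = [
        f for f in feeds
        if f["last_checked"] is None or now - f["last_checked"] >= interval
    ]
    # Never-checked feeds first, then the ones that have waited longest.
    due.sort(key=lambda f: f["last_checked"] or datetime.min)
    # Anything beyond the throttle waits for a later cron run.
    return due[:throttle]
```

With only a handful of feeds, as in this tutorial, the throttle never comes into play, which is why the default is plenty.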

Access control 'simplefeed module' settings
For this tutorial we're not going to change anything here, but you may want to, depending on how your site is used.

'Feed' content type settings page
By default both the 'Feed' content type and the 'Feed Item' content type have their 'Default comment setting:' option set to 'Read/Write'. In this case we don't want users commenting on either so we'll change them to 'Disabled'.

To do so, first go to the 'Feed' content type settings page (admin/content/types/feed) and
scroll down to the 'Workflow' section. Then just change the 'Default comment setting:'
option to 'Disabled'.

Drupal SimpleFeed settings

'Feed Item' content type settings page
Do the same as for the 'Feed' content type and set the 'Default comment setting:' option to
'Disabled'.

Step 5: Find some RSS feeds
As this is going to be a sports news aggregation site I've gathered together the following RSS feed URLs and chosen a taxonomy term for each:

Just remember to check the terms of use sections on each of the websites regarding the republishing of feed content.

Step 6: Add your RSS feeds
Now everything is set up, let's add some feeds!
Go to 'Create content > Feed'. We'll start by adding the BBC sport front page feed with the following settings (the feed URL and taxonomy term are taken from Step 5 above):

Drupal - add a SimpleFeed feed

We've already taken care of all of the other settings (like input format, comment settings), so hit 'Submit'. The feed will be created and you should get the following confirmation screen:

Drupal - SimpleFeed feed created

Now we need to actually import the feed items.

We could wait for our cron job (which we'll set up in part 2 of the tutorial) to fire and trigger the auto update of the feed for us, but for now we'll do it manually.

So, click on the 'Refresh this feed' link and SimpleFeed will look for, and import, the new feed items. SimpleFeed imports each feed item as a node. After a second or two it should have found the new items, created the nodes, and output the following success message:

Drupal - SimpleFeed feed items created
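If you're curious what "imports each feed item as a node" amounts to, the idea can be sketched as follows. This is a simplified model, not SimpleFeed's implementation (SimpleFeed uses the SimplePie library for parsing). It assumes a plain RSS 2.0 document, and the dict keys, the sample feed and the 'BBC' term are placeholders of mine. Note how each item picks up the parent feed's taxonomy terms, as configured in Step 4.

```python
import xml.etree.ElementTree as ET


def feed_items_to_nodes(rss_xml, feed_terms):
    """Turn each <item> of an RSS 2.0 document into a node-like dict.

    feed_terms are the parent feed's taxonomy terms; every item gets a copy.
    """
    channel = ET.fromstring(rss_xml).find("channel")
    if channel is None:
        # Not an RSS 2.0 document with a <channel>; nothing to import.
        return []
    nodes = []
    for item in channel.findall("item"):
        nodes.append({
            "type": "feed_item",
            "title": item.findtext("title", default="").strip(),
            "url": item.findtext("link", default="").strip(),
            "body": item.findtext("description", default="").strip(),
            "terms": list(feed_terms),
        })
    return nodes


# Made-up sample feed, for illustration only.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example sport feed</title>
    <item>
      <title>Match report</title>
      <link>http://example.com/match-report</link>
      <description>Full-time summary.</description>
    </item>
    <item>
      <title>Transfer news</title>
      <link>http://example.com/transfer-news</link>
      <description>Latest signings.</description>
    </item>
  </channel>
</rss>"""

nodes = feed_items_to_nodes(SAMPLE_RSS, ["BBC"])
for node in nodes:
    print(node["title"], node["terms"])
```

A real importer also has to recognise items it has already seen so that refreshing a feed doesn't create duplicate nodes; the sketch leaves that out.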

One quick thing to note here is the first line 'The directory files/cache_simplefeed has been created'. If file permissions for this folder are not set correctly on your server it can cause cron errors, but we'll deal with this later on if it's a problem.
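If you want to check the folder yourself in the meantime, a small script along these lines would work. It is a sketch under a couple of assumptions: the function name is my own, the path is relative to your Drupal root, and the check only means something when run as the same user that runs your web server or cron job.

```python
import os


def cache_dir_problem(path):
    """Return a short description of the problem with the cache
    directory, or None if it looks usable by the current user."""
    if not os.path.isdir(path):
        return "missing"
    # Creating cache files needs write permission on the directory,
    # and reaching entries inside it needs execute (search) permission.
    if not os.access(path, os.W_OK | os.X_OK):
        return "not writable"
    return None


# Hypothetical usage, run from the Drupal root:
#   cache_dir_problem("files/cache_simplefeed")
```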

Finished! (Part 1)
If you now check the front page of your site you should see all of the feed items from the BBC front page RSS feed. Go ahead and add the other feeds the same way as the BBC feed (using the feed URLs and taxonomy terms from Step 5 above). The site won't do a lot yet, though, so we'll sort that out in part 2.

Also in this series
Build An Aggregation Site With Drupal (Part 2), where we set up cron to handle auto updating the feeds and also use views to create some different sports sections (football and baseball) and RSS feeds.

Coming soon...
In part 3 we'll get to theming everything.

15 comments

1
Scifiguy, August 19th 2008 @ 11:36PM

Looks great! I can't wait for you to finish with the whole series! :-)

Is there a reason you did not use the FeedAPI and chose SimpleFeed instead?

2
Guest, August 19th 2008 @ 11:43PM

this is very timely - I want to learn how to do exactly this, without a lot of trial-and-error time cost. thanks so much! looking forward to Part 2 and Part 3, with bated breath...

3
Guest, August 19th 2008 @ 11:59PM

any chance you could write a bit about workflow options, node queue, etc. I would like to do something like this, but not automatically publish the nodes, but rather put them into moderation.

Looking forward to reading more, thanks for taking the time to write this up.

4
Leonid, August 20th 2008 @ 02:10AM

I must admit that I was surprised to see a new guide/tutorial that is based on Drupal 5 and not on the latest Drupal 6. Is there a specific reason for that? Are there some obstacles with any of the modules to deliver the same functionality with Drupal 6?

I've been also trying to aggregate feeds with an emphasis on feeds with images and media files (videos). I’m working with Drupal 6. Last night I have tried a combination of FeedAPI and Embedded Media Field modules, but it seems it’s not possible without the Feedapi_Mapper module, which has not been ported to Drupal 6 yet.

Are you planning to cover how to display images and/or videos referenced by the feed items in your guide?

Anyway, I’m looking forward to parts 2 and 3.

Thank you.

5
Laurence, August 20th 2008 @ 12:06PM

Hi everyone, thanks for the comments so far.

@ Scifiguy - yes it was due to the requirements for the site. I talked about the decision to use SimpleFeed, as opposed to FeedAPI, near the beginning of the tutorial - check that out for more detail.

@ Guest (comment #3) - I hadn't planned on this but it sounds like a good idea. I think what I will do is gather a list of extra suggestions from people whilst I publish the first 3 parts and then cover them all in a follow up 4th part post.

@ Leonid - regarding using Drupal v5 instead of v6 you kind of answered the question already. SimpleFeed does not have a Drupal 6 site ready version available yet, so I went with Drupal 5.
As for displaying images and videos - I won't be covering this in the tutorial. You really need to use FeedAPI (as you already are) to do that kind of thing more easily.

6
Brent, August 20th 2008 @ 01:29PM

Thank you for doing this series! Looking forward to seeing how you will do it.

@ Leonid As far as I know, neither FeedAPI nor Simplefeed have support for Views 2 yet in Drupal 6. That adds a big hiccup to this whole process of displaying things back. I'm sure it's coming but it is not there yet.

7
Brent, August 21st 2008 @ 07:31PM

@Leonid Quick correction: FeedAPI just got the code for Views 2 support committed in the Drupal 6 branch, and it will be in future releases soon. http://drupal.org/node/238851

8
Kate, September 21st 2008 @ 01:22AM

Looking forward to making this work for a site I have, but I ran into a little problem. I don't see the URL option anywhere on my feed page. I went to 'Create content > Feed' as you show in step 6, but it's not there. Funny, the URL field is there when I go to 'Create content > Feed Item'. Any ideas?

9
peter, September 24th 2008 @ 01:43AM

I'm having the same issue as Kate. Perhaps it's because I'm using the Drupal 6 version? This page seems to indicate that it doesn't work... but there is a dev release on the simplefeed page.

10
Tom, October 18th 2008 @ 12:00AM

Yes, I also have the same problem as Kate. I'm using Drupal 5.11 with simplefeed 5.x-3.1 and there is no URL field. There's no way to create a feed without a URL. Any ideas?

11
N8, November 24th 2008 @ 08:01PM

Can you explain the RSS feeds taxonomy a little more? Can't you just add the feeds Title and URL?
I am ok up until this part, then I get a PAGE CANNOT BE FOUND error 8(

12
N8, November 24th 2008 @ 08:56PM

So after you create the "Source" vocabulary, do you add your RSS feed URLs and taxonomy term for each feed to this vocabulary?
Does this make sense?! I am noob! 8<(

13
dnevni horoskop, January 12th 2009 @ 05:36PM

I'm a newbie in Drupal, but this post is very clear, and thanks for that.

14
vesti vijesti, June 4th 2009 @ 01:32PM

Great tutorial. It's been very useful to my work on my Drupal site.

15
Stanovi Trebinje, July 28th 2009 @ 01:09PM

Very good tutorial. It helped me to build my own RSS feeds site.
