Here is an enormous Yahoo Pipe. You can see it at its full size by clicking here: Giant Yahoo Pipe.
I’m going to use it to explain how to perform many different data manipulations using Yahoo Pipes. Just so you know, Yahoo Pipes is used to pull data from websites, manipulate that data, create an RSS feed with that data, and then send it wherever you would like. For more information on Yahoo Pipes, see my tutorial: Yahoo Pipes Tutorial.
In this article I’ll show you:
And, a bunch more!
Remember you can open the full screen version of the Yahoo Pipe to follow along.
Using the Yahoo Search Module in Yahoo Pipes
Here I’m using the 2 User Input modules (URL Input & Text Input) to pass text to the Yahoo Search Module. I then filter those results based on whether the description contains the word “star”. I pull the full article from the website and place it in the feed description. To finish it all off, I truncate the list to the top 5 results. This is what it looks like.
Yahoo Search & User Input Modules
The Yahoo Search Module takes 2 inputs: the URL you want to restrict the search to (this is optional) and the keywords you want to search for. It then outputs the results, sorted by relevance to the search term.
I’m allowing the Yahoo Pipe user to enter the URL to search through with the URL Input Module. You can define the order in which the input is listed by typing a value into the Position field. I chose the Huffington Post as the default URL to search.
Similarly, I used the Text Input Module to let the user enter the keyword, with the word “Fed” as the default.
Yahoo Pipe Filter Module
I then defined that I want to block any articles in which the description contains the word “star”. I could instead permit only articles that contain the word “star”. Searches can be further refined by adding more rules and requiring that a description match all of them rather than any one of them. You can filter on several fields, such as the title (item.title), the description (item.description), the link to the original article (item.link), etc.
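If it helps to see the block/permit and any/all choices outside of Pipes, here is a minimal Python sketch of the same idea. It is only an analogy, not how Pipes works internally: the function names, the rule format and the sample items are all made up for illustration.

```python
def matches(item, field, word):
    """Return True if the given field of the item contains the word (case-insensitive)."""
    return word.lower() in item.get(field, "").lower()


def filter_items(items, rules, mode="block", combine="any"):
    """Keep or drop items depending on whether they match the rules.

    rules   -- list of (field, word) pairs, e.g. [("description", "star")]
    mode    -- "block" drops matching items, "permit" keeps only matching items
    combine -- "any" matches if one rule hits, "all" requires every rule to hit
    """
    check = any if combine == "any" else all
    result = []
    for item in items:
        hit = check(matches(item, field, word) for field, word in rules)
        # Keep the item when (permit and hit) or (block and no hit).
        if (mode == "permit") == hit:
            result.append(item)
    return result


# Hypothetical feed items, shaped like item.title / item.description.
items = [
    {"title": "Fed raises rates", "description": "Markets react to the Fed."},
    {"title": "Movie news", "description": "A star is born on the red carpet."},
]

blocked = filter_items(items, [("description", "star")], mode="block")
permitted = filter_items(items, [("description", "star")], mode="permit")
print([i["title"] for i in blocked])    # ['Fed raises rates']
print([i["title"] for i in permitted])  # ['Movie news']
```

Note that this sketch does a plain substring match, so “star” would also match “start”.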
Yahoo Pipe Truncate Module
After all of the articles are gathered, I use the Truncate Module to drop every article except the top 5 most relevant ones.
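In code terms, truncating is just keeping the first few items of a list that is already in the order you want. Here is a tiny Python sketch under that assumption; the truncate name and the sample articles are hypothetical.

```python
def truncate(items, count=5):
    """Return only the first `count` items, preserving their order."""
    return items[:count]


# Eight made-up search results, already sorted by relevance.
results = [{"title": f"Article {n}"} for n in range(1, 9)]

top_five = truncate(results, 5)
print([item["title"] for item in top_five])
# ['Article 1', 'Article 2', 'Article 3', 'Article 4', 'Article 5']
```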
Fetch a Feed & Fetch a Foreign Feed & Translate it
The next Yahoo Pipe modules will be used to:
This is what it would look like:
What Am I Doing Here?
RSS feeds are great, but they usually carry only a short description, which is not great for posting to your automated website. To get the whole article instead of a mini description, just combine the Fetch Feed, Loop and Sub String Modules. Here is what to do:
Wrapping Up the Giant Yahoo Pipe
The Unique Module
After every feed is combined into one, I send the items to the Unique Module. In this case the module checks for duplicate titles. Of the available options, that is probably the only one that will eliminate duplicate articles in the feed.
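The same de-duplication can be sketched in a few lines of Python. This is only an illustration of the idea of keeping the first item for each title; the unique_by function and the sample items are assumptions, not anything taken from Pipes.

```python
def unique_by(items, field="title"):
    """Drop items whose `field` value was already seen, keeping the first one."""
    seen = set()
    result = []
    for item in items:
        key = item.get(field)
        if key not in seen:
            seen.add(key)
            result.append(item)
    return result


# Made-up items from two feeds that both carry the same story.
combined = [
    {"title": "Fed holds rates", "link": "http://example.com/a"},
    {"title": "Fed holds rates", "link": "http://example.org/b"},
    {"title": "Markets rally", "link": "http://example.com/c"},
]

deduped = unique_by(combined, "title")
print([item["link"] for item in deduped])
# ['http://example.com/a', 'http://example.com/c']
```

Checking the link instead would not catch this duplicate, because the two sites publish the story at different addresses.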
The Sort Module
I use the Sort Module to sort all of the items by publication date, in ascending order. You can sort on other fields, but I believe this is the most common choice.
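For comparison, here is a small Python sketch of sorting items by publication date. It assumes every item has a pubDate string in the usual RSS date format with a numeric time zone; the function name and the sample items are made up.

```python
from email.utils import parsedate_to_datetime


def sort_by_pub_date(items, descending=False):
    """Return the items ordered by their RSS-style pubDate string."""
    return sorted(
        items,
        key=lambda item: parsedate_to_datetime(item["pubDate"]),
        reverse=descending,
    )


# Hypothetical items in no particular order.
items = [
    {"title": "C", "pubDate": "Wed, 07 Apr 2010 09:30:00 +0000"},
    {"title": "A", "pubDate": "Sat, 03 Apr 2010 18:00:00 +0000"},
    {"title": "B", "pubDate": "Mon, 05 Apr 2010 12:15:00 +0000"},
]

oldest_first = sort_by_pub_date(items)
print([item["title"] for item in oldest_first])  # ['A', 'B', 'C']
```

Ascending order puts the oldest article first. Pass descending=True if you want the newest article at the top of the feed instead.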
The Sub String Module
In this example, the Sub String Module trims each article to a maximum length of 200 characters. You can make the description longer or shorter by changing that length.
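Here is roughly the same trimming step as a Python sketch. The sub_string and shorten_descriptions names are hypothetical, and the sketch assumes each item keeps its text in a description field.

```python
def sub_string(text, start=0, length=200):
    """Return at most `length` characters of `text`, beginning at `start`."""
    return text[start:start + length]


def shorten_descriptions(items, length=200):
    """Return copies of the items with each description cut to `length` characters."""
    return [
        {**item, "description": sub_string(item.get("description", ""), 0, length)}
        for item in items
    ]


# Made-up items: one long description (500 characters) and one short one.
items = [
    {"title": "Long", "description": "word " * 100},
    {"title": "Short", "description": "Already brief."},
]

shortened = shorten_descriptions(items, 200)
print([len(item["description"]) for item in shortened])  # [200, 14]
```

A plain character cut like this can stop in the middle of a word or an HTML tag, so treat the result as a teaser rather than clean markup.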
The String Builder Module
Here I’m creating a Read More link at the end of the article that will point back to the original article. This is considered good web etiquette and is required by most sites online. The String Builder Module allows you to build custom strings on the fly. To create the Read More tag I just enter:
That’s All Folks
There you have it: a giant Yahoo Pipe that demonstrates how to use most of the modules and how they are often used. If you have any questions, leave them in the comment section below.
Till Next Time
Think Tank
Sir, I found your tutorials very useful.
I’m trying to use the Fetch Page module to create a custom RSS feed, but I have only basic knowledge of HTML. I was able to split the page into a number of content items, and then used the Rename module to copy the contents as link, title, description and pubDate, as all of these are available within the contents.
I’m stuck after this point: how do I trim off the unwanted data from title, pubDate, link and description? I know my solution is in the Regex module. Can you please tell me the symbols that I should use to search and replace the contents with the respective item data (something like feed43.com, I hope you understand what I mean)?
Thanks, eagerly waiting for a reply
I created an article on Regular Expressions that I believe will solve your issue. It is available here http://www.newthinktank.com/2010/04/javascript-scripting-tutorial-part-3/
If you need a more specific solution, just tell me and I’ll do my best to answer it. Thank you 🙂
Awesome!!!
Another Awesome tutorial from you 🙂
Learned New things today.
Hi Derek, will it be possible to use Yahoo! Pipes along with amazon.com affiliates. Thus making a Pipe which show feeds for a particular Keyword product in such a way that the amazon affiliate code automatically gets added to each product thus Making a Au
There are a few things that will be hard to overcome. Amazon doesn’t create an RSS feed for all of its pages. Also, Yahoo Pipes has been really odd recently. It seems to block my feed on and off if it generates too much traffic. I’m about 80% done with a program built in Python that automates posting. You can see that in the Python tutorials here. When I get it completely done I’ll make all the code freely available to everyone. Check out the Python tutorial till I get to that. Thanks a bunch
Sir, thanks for your great tutorial. I have a problem. I want to know how to change or remove the item.link attached to the title, as it points to the original feed source.
I am waiting for your response.
Thanks
Deepak
I’m not sure I understand your question. item.link refers to the link to the original article from the feed. You would use it to pull information from the original article and not just the description found in the feed. item.title is a reference to the title of the article located in the feed. Please provide more information on what you are trying to do and I’ll do my best to help you out. Thanks
I have created a Yahoo Pipe, and I publish it with my favorite WP plugin (FeedWordPress). When I view a published post, the “Post title” points to the original source website. I want to change this to my own title. How do I do this?
Have you tried piping it through the Regex Module? It provides you with a great deal of editing capability. How do you want to change the titles? Do you want to append the same words every time? That would be easy.
hello
I am trying to take a feed, extract the first 400 words of each article and translate them.
I have succeeded in fetching the whole article using Loop + Fetch Page, but:
1. I don’t know how to limit it to only the first 400 words of the article
2. the translation doesn’t seem to work on the whole extracted article. It only works on the title and on the excerpt of the article sent by the RSS feed
Can you help?
Could you send me a link to your pipe? I’ll take a look at it. If you’re interested, I created a bunch of tutorials on web site scraping as well: How to Scrape Web Sites. Thanks
it has 2 variants
1. is working well
2. with fetch page doesn’t translate the content, only the title
http://pipes.yahoo.com/pipes/pipe.info?_id=8b2d17d9b868f23261da205c65b15f96
For the second I also don’t know how to cut from: to: + first 400 words
To grab the full-length article you need to connect to the link associated with the title in the feed. The problem is that Pipes doesn’t seem to like it when you do this, and the feed you create with Pipes becomes unavailable from time to time. I’m looking into whether there is a solution to this issue.
thank you for your time.
I’m waiting here for an answer.
I have set up the Yahoo Pipes tool to grab the whole article and, unlike before, it fails to create the RSS feed. They may be blocking people from taking any more than a small part of the original article. I’ll report back to you if I can find a way around this. Here is Python code you can use to grab that information just like with Pipes: Python Website Scraping. Thanks
Pals, thanks for the amazing tutorial. I’m new to Pipes. Could you help me delete the video from an RSS feed? I have got the content, but I don’t need the video.
Thanks for your kind help and nice to meet you and your blog… 😉
Hi,
I am truly impressed by your work, wow!
On my site I’m trying to insert a Yahoo Pipe that takes this page http://www.comune.spoleto.pg.it/elenco_comunicati (which has no feeds) and extracts the contents of the table on the page (using the Fetch Page module).
Then, given that the content in the table does not have a pubDate field, and therefore I would not be able to sort the contents by date, I could use your tutorial to extract the full content of each news item, separate the description from the pubDate, and create the final feed sorted by pubDate.
What I ask you is: Could you help me build the first part of the pipe, the one that extracts the table from the page using Fetch Page?
Sorry, but I just can not find solutions …
Thanks anyway
Carlo
I hope that I explained
Hi Carlo
I would show you how to use Yahoo Pipes if I thought it would work consistently for you. Yahoo seems to have abandoned Pipes and nothing seems to work consistently now. I have a ton of tutorials using PHP for website scraping that I assure you will work. Just look up web site scraping and regular expressions on my site. I have finished scripts that you can download.
Hi… is it possible to combine 2 articles from 2 feeds into one article?
You could with some PHP dabbling, but I can’t think of a way to do that with just Pipes
Could you please give me an example source? I tried Google and could not find any relevant source…
I’m not sure what you are looking for?
thank you for the tutorials
Hi, I’m trying to remove all tags, but I want the images to remain. How do I do that?
Hi, it’s a great article.
I want to ask how to merge 2 feeds into 1. For example: the 1st feed contains “iam” and the 2nd feed contains “handsome”. How can you put them together into 1 feed that contains “iam handsome”?
If you have any advice, please mail me.
Thanks, and sorry for my bad English 🙂
Hi, I followed this below
The descriptions I built with this Yahoo Pipe followed by the link to the original article
…
Read More
but I can’t seem to get it working
I can see the “Read More” text but it doesn’t have the link behind it…
Hope you can help.
Here is my pipe
http://pipes.yahoo.com/pipes/pipe.edit?_id=257db5c731d8d963f852a43d078fdfb8
Thanks a lot..
I mean the link behind the “Read More” is not working. I followed your String Builder Module, but mine is not working.
Please help. Here is my pipe:
http://pipes.yahoo.com/pipes/pipe.edit?_id=257db5c731d8d963f852a43d078fdfb8
Hi, I have created a very simple pipe to pull in a feed from twitter. When I run pipe it publishes the result as a URL and then again below it as text, so it’s duplicated. I only want it as a text so I can re tweet it. For an example go to Yahoo Pipes.
Fetch feed
http://search.twitter.com/search.atom?q=+%22kruger%22+OR+%23kruger+park%23
to Pipe output and run pipe to see what I mean.
Any help will be appreciated.
I agree with remy etienne, the link does not work. If yours works, please give us the link to the pipe. I’ve been trying to solve this problem for a long time.
Sorry, but many services using Yahoo pipes have been deprecated. It seems like Yahoo isn’t supporting it
Awesome tutorials!
Thank you 🙂