
Posts

Showing posts from January, 2009

Web Crawler API

The Japanese Crawler project is a research and development project that I did at my company. It was a real challenge for me. My target host was a Japanese patent site, and I wanted to crawl data from it. Normally web hostnames start with www, but surprisingly this one used www7 and www8. When I first saw it, I was confused. However, it was not important to my work.
First, I did some research on crawlers and bots before starting the project design. At first I thought this was a common type of web crawler project, so I downloaded some sample crawler source code from SourceForge. Then I ran the sample crawler against my host (www7.xxxx.com). Ohhhhh! There was no output; it only showed one page, called www7.xxx.ipdl. I couldn't believe it, because the same crawler worked with other sites like www.google.com, etc. After that I understood that this was not a common type of web crawler project.
Then I had to work hard. First I…