Thursday, February 16, 2006

Topic Maps and RDF scutters (assertion spidering bots): State of the art?

I am looking for the state of the art in Topic Maps and RDF scutters (assertion spidering bots).
Do you have any useful pointers/hints?

The growing use of semantic knowledge technologies like RDF and Topic Maps should result in larger collections of represented assertions available on the internet. Scutters (information agents spidering such assertions) could collect and integrate them.

One example of such an assertion scutter is:
http://rdfweb.org/topic/Scutter
http://rdfweb.org/topic/ScutterVocab
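
The crawl loop of such a scutter might be sketched roughly as follows. This is only a minimal illustration, not the code of any of the tools named in this post. It assumes that documents point to further documents via rdfs:seeAlso, as is common in FOAF data. The in-memory FAKE_WEB, the example.org URLs and the function names are hypothetical stand-ins for real HTTP fetching and RDF parsing.

```python
from collections import deque

SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"

# Hypothetical in-memory "web": document URL -> list of (subject, predicate, object).
# A real scutter would fetch each URL over HTTP and parse the RDF it finds there.
FAKE_WEB = {
    "http://example.org/a.rdf": [
        ("http://example.org/people#alice", FOAF_NAME, "Alice"),
        ("http://example.org/people#alice", SEE_ALSO, "http://example.org/b.rdf"),
    ],
    "http://example.org/b.rdf": [
        ("http://example.org/people#bob", FOAF_NAME, "Bob"),
        ("http://example.org/people#bob", SEE_ALSO, "http://example.org/a.rdf"),
        ("http://example.org/people#bob", SEE_ALSO, "http://example.org/missing.rdf"),
    ],
}


def fetch_triples(url):
    """Stand-in for fetching and parsing; unknown documents yield no triples."""
    return FAKE_WEB.get(url, [])


def scutter(seed_urls, max_documents=100):
    """Breadth-first crawl that records every triple together with its source."""
    queue = deque(seed_urls)
    visited = set()
    collected = []  # entries are (source_url, subject, predicate, object)
    while queue and len(visited) < max_documents:
        url = queue.popleft()
        if url in visited:
            continue
        visited.add(url)
        for s, p, o in fetch_triples(url):
            collected.append((url, s, p, o))
            # Follow seeAlso links to documents not yet visited.
            if p == SEE_ALSO and o not in visited:
                queue.append(o)
    return collected, visited


if __name__ == "__main__":
    triples, documents = scutter(["http://example.org/a.rdf"])
    for source, s, p, o in triples:
        print(source, s, p, o)
```

Keeping the source URL with every triple is a deliberate choice in this sketch: in an aggregation scenario one usually wants to know who asserted what.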

I have sketched the idea in my blog entry dated 24th Nov., 2005.
http://asigel.blogspot.com/2005/11/ideas-for-aggregation-of-distributed.html

Do you have any information on the following four questions?

(1)
Which available collections of statements/assertions do you know of, and which can you recommend for an aggregation scenario?


I want to use them in a content aggregation scenario where statements about the same subjects are collocated. Ideally, such a collection would use Published Subjects (or subject indicators). I am particularly looking for topic map data, but would also like to know about RDF data, since according to the latest guidelines on semantic interoperability, useful mappings are possible between Topic Maps and RDF.
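
What I mean by collocation can be sketched as follows, under the assumption that each harvested topic carries a set of subject identifiers and that topics sharing at least one identifier describe the same subject (the usual Topic Maps merging rule). The data layout, the example.org identifiers and the function name are hypothetical and only serve as illustration.

```python
def merge_by_identifier(topics):
    """Merge topics that share at least one subject identifier.

    Each topic is a dict with 'identifiers' (a set of URIs) and
    'statements' (a list). Merging is transitive: a topic that shares
    identifiers with two existing groups joins them into one group.
    """
    merged = []  # groups with pairwise disjoint identifier sets
    for topic in topics:
        ids = set(topic["identifiers"])
        statements = list(topic["statements"])
        remaining = []
        for group in merged:
            if group["identifiers"] & ids:
                # Absorb the overlapping group into the current one.
                ids |= group["identifiers"]
                statements = group["statements"] + statements
            else:
                remaining.append(group)
        remaining.append({"identifiers": ids, "statements": statements})
        merged = remaining
    return merged


# Hypothetical harvested topics from three different sources.
SAMPLE_TOPICS = [
    {"identifiers": {"http://example.org/psi/puccini"},
     "statements": ["source A: born in Lucca"]},
    {"identifiers": {"http://example.org/other/puccini"},
     "statements": ["source B: composed Tosca"]},
    {"identifiers": {"http://example.org/psi/puccini",
                     "http://example.org/other/puccini"},
     "statements": ["source C: was a composer"]},
    {"identifiers": {"http://example.org/psi/verdi"},
     "statements": ["source A: composed Aida"]},
]

if __name__ == "__main__":
    for group in merge_by_identifier(SAMPLE_TOPICS):
        print(sorted(group["identifiers"]), group["statements"])
```

The third sample topic bridges the first two because it carries both identifiers, so all three end up collocated, while the Verdi topic stays separate.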

MusicBrainz is one example of a semantic web service with RDF.
There are also, for example, approaches for converting genealogical data (GEDCOM) to RDF FOAF,
or one might use the DMOZ RDF data.

(2)
Which scutters (spidering information agents for RDF and/or topic maps,
or fragments thereof) do you know of or can you recommend?

I am planning to use:
http://search.cpan.org/~kjetilk/RDF-Scutter/
an LWP agent based on RDF::Redland.

Does something similar already exist for Topic Maps?

I know of some Java agents,
in particular the CC-licenced
Slug: A Simple Semantic Web Crawler (December 09, 2004) http://www.ldodds.com/blog/archives/000167.html
http://aloo.gnomehack.com/~ldodds/projects/slug/javadoc/

SECO contains an
RDF Crawler: Scutter (Bash and Python for Scuttering) http://triple.semanticweb.org/svn/aharth/2004/wwwnyc/seco-talk.html
http://www.harth.org/andreas/2004/ieeeis/
Harth, A.: SECO: mediation services for semantic Web data. IEEE Intelligent Systems (USA), May/Jun 2004, Vol. 19, No. 3, pp. 66ff.
Harth and Gassert describe a 103 MB test data set they compiled:
On Searching and Displaying RDF Data from the Web http://sw.deri.org/2004/12/derisearch/Eswc2005Demo.pdf

In addition, researchers in SNA (Social Network Analysis) write scutters.
The data set compiled e.g. by PhD student Peter Mika is impressive:
Social Networks and the Semantic Web
http://doi.ieeecomputersociety.org/10.1109/WI.2004.10039

There exists a Redfoot-RDF-Scutter in Python:
http://redfoot.net/scutter/
for which a REST interface has been proposed:
Sun, 29 Jan 2006
A RESTful Scutter Protocol for Redfoot Kernel http://copia.ogbuji.net/blog/2006-01-29/A_RESTful_

There is a Javascript extension for Mozilla:
Scuttering Composite RDF Datasource
http://nachbaur.com/software/mozilla/objects/index.xhtml

In section 2.4 of his research proposal "Mining the Semantic Web", Ajay Chakravarthy names some existing tools (Ontotext, Hackdiary, others with poor performance):
http://www.dcs.shef.ac.uk/~ajay/reports/Research%20Proposal.pdf

HyperSpider (Java app) collects the link structure of a website. Data import/export from/to database and CSV files. Export to Graphviz DOT, Resource Description Framework (RDF/DC), XML Topic Maps (XTM), Prolog, HTML. Visualization as hierarchy and map.
http://hyperspider.sourceforge.net/
(I could export website interlinkings with this, but that is merely formal metadata.)

A List of RDF Crawlers
http://www.dbis.informatik.uni-frankfurt.de/~tolle/RDF/RDFReferences.html
(4 entries)
* RDF Crawler (in Java) from Institute AIFB, University of Karlsruhe, Germany
* Decentralised and reliable resource discovery using RDF metadata (also known as Fydra)
* DAML Crawler
* RDF Crawling Services - RDF Gateway

LuMriX, which is topic map-based, contains a crawler, but I do not know enough about it.
http://www.lumrix.de/xmlsearch_keyfacts.php

(3)
Which sites freely offer semantic web services?
I want to retrieve assertions, i.e. fragments of knowledge networks realized with Topic Maps or RDF.
Preferably with a possibility to retrieve by published subject (or subject indicator).
Indirect search by name where I assert the identity of the subject might do for the moment.
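
The retrieval behaviour I have in mind could look roughly like this. It is only a sketch over a local list of topics, not the interface of any existing service. The record layout, the example.org identifier and the function names are hypothetical.

```python
def find_by_identifier(topics, identifier):
    """Exact lookup by published subject identifier (or subject indicator)."""
    return [t for t in topics if identifier in t["identifiers"]]


def find_by_name(topics, name):
    """Indirect lookup: case-insensitive match on any name of a topic."""
    wanted = name.casefold()
    return [t for t in topics
            if any(n.casefold() == wanted for n in t["names"])]


def retrieve(topics, identifier=None, name=None):
    """Prefer the identifier; fall back to the name if that finds nothing.

    Returns (hits, how). Hits found by name are only candidates: the
    caller still has to assert that they really are the intended subject.
    """
    if identifier is not None:
        hits = find_by_identifier(topics, identifier)
        if hits:
            return hits, "identifier"
    if name is not None:
        return find_by_name(topics, name), "name"
    return [], "none"


# Hypothetical local data standing in for a remote service.
SAMPLE = [
    {"identifiers": {"http://example.org/psi/tosca"},
     "names": ["Tosca"], "statements": ["is an opera"]},
    {"identifiers": set(),
     "names": ["tosca"], "statements": ["premiered in Rome"]},
]

if __name__ == "__main__":
    print(retrieve(SAMPLE, identifier="http://example.org/psi/tosca"))
    print(retrieve(SAMPLE, name="Tosca"))
```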

(4)
Do you know of demo sites which can be externally queried with TMRAP 0.2 (or higher: 1.0, 2.0)?


Scratchpad of additional references:
(not yet checked)
------------------------------------

Current State of Semantic Web Mining
http://www.fernuni-hagen.de/DVT/Aktuelles/zhao_yi.pdf
Check from slide 38 onwards, though not so useful for this purpose.
Ontobroker, which includes an ontology-based web crawler

DefineCrawler
http://www.lalic.paris4.sorbonne.fr/stic/octobre/octobre/apr/Nauer.pdf

RDFWeb notebook: aggregation strategies
http://rdfweb.org/2001/01/design/smush.html
(describing Swoogle)

Finding and Ranking Knowledge on the Semantic Web http://www.cs.umbc.edu/~ypeng/Publications/2005/iswcLiDing.pdf

Search on the Semantic Web
http://www.cs.umbc.edu/~ypeng/Publications/2005/IeeeSemanticWebSearch.pdf

JNotes. Automatic Generation of Semantic Networks
http://www.jnotes.de/JNotes/jnotes_webware.nsf/0/2DC6FB39AE566557C12570EC00307C3B?openDocument

[xtm-wg] Sketch of a Possible Algorithm for Fragment Grabbing (2000) http://lists.oasis-open.org/archives/topicmaps-comment/200007/msg00018.html

Pragmatic applications of the Semantic Web using SemTalk.
The agents are supported by crawlers searching proactively or after request for existing models to generate index files for the agents. The crawlers do not only look in the local filesystem, but also in the Semantic Web, for available knowledge sources in the RDFS format.
http://www.semtalk.com/pub/KnowTech2001.htm

Metadata-based Web Querying
http://www.cs.bilkent.edu.tr/~ismaila/research_projects.htm

RDFStore
Perl/C RDF storage and API
http://rdfstore.sourceforge.net/

CARA is an RDF API written in Perl
http://cara.sourceforge.net/
