Friday, October 27, 2006

Odyssey of a Research Student, Episode I

Long lost in the direction and topic of my research, now I gotta work on something concrete. For the following month at least, I'll focus on LC with the HowNet dictionary (for my first paper, hopefully) and sentence segmentation (for the course project). Today I kicked off with some tests on measuring word association. Not until today did I discover that the G-square test done by Nicola Stokes was based on Ted Pedersen's "Fishing for Exactness". Another useful (and frequently quoted) paper is "Acquiring Collocations for Lexical Choice between Near-Synonyms" by Inkpen and Hirst. I'll work through these statistical tests to see which one works best, accompanied by some preliminary evaluation. Good luck Kelvin!
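
For reference, here is a minimal sketch of the G-square (log-likelihood ratio) statistic on a 2x2 contingency table of word-pair counts. The function name, the argument layout and the example counts are assumptions made up for illustration; this is not the code used by Stokes or Pedersen.

```python
import math


def g_squared(n11, n12, n21, n22):
    """G-square (log-likelihood ratio) for a 2x2 table of co-occurrence counts.

    Hypothetical layout:
      n11 = count(w1 followed by w2)      n12 = count(w1 followed by not-w2)
      n21 = count(not-w1 followed by w2)  n22 = count(not-w1 followed by not-w2)
    """
    observed = [[n11, n12], [n21, n22]]
    total = n11 + n12 + n21 + n22
    row_sums = [n11 + n12, n21 + n22]
    col_sums = [n11 + n21, n12 + n22]

    g2 = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a zero cell contributes nothing to the sum
                expected = row_sums[i] * col_sums[j] / total
                g2 += o * math.log(o / expected)
    return 2.0 * g2


if __name__ == "__main__":
    # Made-up counts, purely for illustration.
    print(g_squared(30, 10, 10, 950))
```

With one degree of freedom, a score above 3.84 is significant at the 0.05 level under the usual chi-square approximation. That approximation gets shaky when the counts are small, which is the case where an exact test is worth comparing against.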

Thursday, October 05, 2006

Keith van Rijsbergen

This all started with my Wikipedia search for "SIGIR".
From the search results I discovered there is this "Salton Award" stuff at the SIGIR conference. The first thing that caught my eye was the paper by the SIGIR '06 awardee. The title "Quantum Haystacks" is really attractive in the midst of the quantum computing fervor. A glimpse at the paper got me excited, even though most of it went over my head, and urged me to find out more about this guy van Rijsbergen. Surprisingly, his newest book "The Geometry of Information Retrieval" is in the CUHK library collection. Without a second of hesitation I darted to the library to snatch the book. Well... flipping through the first few pages and the math about Hilbert space didn't bring me new insights into my research, but I am pretty sure his theory is gonna have an impact on the world of IR. Here comes a chance for me to get to know this guy, if I am fortunate enough: ESSIR 2007.
http://www.dcs.gla.ac.uk/essir2007/index.html

Wednesday, October 04, 2006

Finally back

I should have resumed writing as soon as I got back from Beijing. Oblivious as I am, I neglected the existence of the blog completely. Now I'm back. I believe I'm the sole reader of the blog. Anyway, it's just for the purpose of keeping track of my work. It is also a good idea to keep track of the papers that I've read, am reading, or have finished.

Highlight of some current work:
1) Lexical Cohesion - Based on the thesis of Nicola Stokes, Dr. Xie and I are working on a Mandarin Chinese corpus of VOA. I finished a preliminary draft (v0.2) yesterday. We had some interesting discussion yesterday, and the next step will be to play with these ideas. Details are coming.

2) Today's reading: Ma, B., Li, H., A Phonotactic-Semantic Paradigm for Automatic Spoken Document Classification, In Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 369-376.
Wow, Sinkee dudes, cool!