October 14, 2004

Index all those scientific papers of yours

O'Reilly Network: Google Your Desktop. It's finally here: the thing I've been waiting for Google to do for such a long time, index my files for me so that I can find my papers faster. Read the excellent review by Rael Dornfest at the link above.

I will no longer rely on the Windows search program to find my files. In our line of business, we have to be able to refer to papers quickly and easily during the writing process. I decided long ago to save every paper that I read (or was going to read) as the original PDF on my local hard drive. But alas, I quickly amassed a cache of papers that I had difficulty organizing. I tried a few things, such as a separate directory for each journal or organizing by year, but both methods proved lacking. Recently, I just threw all the papers into one directory called "My Papers" and let them sit there until I could figure out how to deal with them. Now Google has provided me with a great tool. If I use Google Desktop in conjunction with PubMed/Medline, I can easily find out whether a paper I discovered through Medline is already on my hard drive.

My only wish is that Google continues to improve its search tool until it includes every feature it has for web searches. Most importantly, Google needs to index not just the names of the PDF files but the content within them, just as it does for PDF files found on the web. A quick click on "View as text" could give me preliminary confirmation that the file is the one I wanted. I realize that this works for web search because Google caches those converted PDF files, but the disk space my computer would need to cache my own PDF files is something I'm willing to give up to help me in my line of work.

I'm going to do a lot more testing of Google Desktop. I wonder if it'll index network directories. If I mounted my Linux home directory via Samba and mapped it to a drive letter, would Google Desktop crawl through it? It would also be nice if Google released, say, a *nix daemon that would crawl through the /home directory (or any directory that I told it to), so that I could search my Linux desktop. It should be fairly straightforward for them to port it to Linux and have it run as a service, like CUPS does for printing. That is, I could access Google Desktop via my Linux browser (Mozilla or Firefox) pointing to http://127.0.0.1:4664/&s=xxxxxxxxxx. I hope this is on their agenda.
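
To make the crawl-and-search idea concrete, here is a rough Python sketch of what the indexing half of such a daemon might look like. This is purely my own illustration and has nothing to do with how Google Desktop actually works: the function names (tokenize, build_index, search) are made up, and it only indexes file names, not the text inside the files.

```python
import os
import re
from collections import defaultdict


def tokenize(name):
    """Split a file name or query into lowercase alphanumeric tokens."""
    return [t for t in re.split(r"[^a-z0-9]+", name.lower()) if t]


def build_index(root):
    """Walk `root` and map each file-name token to the set of paths using it.

    Hypothetical helper: a real desktop indexer would also read file
    contents and keep its index up to date as files change.
    """
    index = defaultdict(set)
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            for token in tokenize(filename):
                index[token].add(path)
    return index


def search(index, query):
    """Return the sorted paths whose file names contain every query token."""
    tokens = tokenize(query)
    if not tokens:
        return []
    hits = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        hits &= index.get(token, set())
    return sorted(hits)
```

Calling build_index on a home directory and then search(index, "kinase 2004") would list every file whose name contains both words. A real service would wrap something like this in a small local web server so the browser could query it.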

Posted by johnvu at October 14, 2004 11:51 AM