Wednesday, 2 October 2013
Saturday, 28 September 2013
Friday, 21 June 2013
Tuesday, 18 June 2013
Zotero and R: automatically find relevant scientific articles with the Microsoft Academic Search API
The principle is simple: find the articles most frequently cited by the authors in your Zotero database (and which you haven't yet read), and find other articles that cite the articles in your database. Rather than using keyword-similarity algorithms, this script just assumes that authors who think and read as you do probably know what's relevant.
Below is the code, which can be pasted into your R terminal. There is one source-code file that should be downloaded into your working directory (download from here) and named "SQL_zotero_query.txt" (thank you Royce). The sections that need to be customized are the locations of: a) your Zotero database folder (the one containing zotero.sqlite) and b) the folder in which to save the output html files. If you use this frequently, you should also get your own free MSA API key (mine is provided below but allows only a limited number of queries).
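Before the full script, the ranking idea can be sketched on toy data. This is only an illustration in base R, not the script itself: the data frame, its column names, and the paper IDs are all made up, and no Zotero database or API is touched.

```r
# Hypothetical toy data: each row is one citation made by an author
# who already appears in your library (all names and IDs are invented).
citations <- data.frame(
  citing_author = c("Smith", "Smith", "Jones", "Jones", "Lee", "Lee"),
  cited_paper   = c("P1", "P2", "P2", "P3", "P2", "P3"),
  stringsAsFactors = FALSE
)

# Papers already in your library (assumed known)
library_papers <- c("P1")

# Drop papers you already have, then count how often each remaining one is cited
unread  <- citations[!citations$cited_paper %in% library_papers, ]
ranking <- sort(table(unread$cited_paper), decreasing = TRUE)

print(ranking)
```

Here the paper cited by the most of "your" authors floats to the top of the ranking; the real script does the same kind of counting on citation lists fetched from the API.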
Enjoy! Please send me any suggestions and questions!
And some example output...
Why R? The above script just serves as a one-stop shop for SQL and JSON processing. On the side, I also use R's wonderful visualization tools and matrix-processing facilities to play around with authors and keywords. But really, the above script could probably be run more efficiently in Python or Java.
A special thanks to the post by Royce Kimmons at http://royce.kimmons.me/tutorials/zotero_to_excel for the SQL command to access Zotero databases.
BTW, in case you're wondering why I'm using two open-source projects with a Microsoft project: other free online tools such as CiteUlike or CiteseerX do NOT provide the needed forwards-citation or backwards-citation information, neither through an API nor through web scraping. I'd love some alternatives.
Sunday, 19 May 2013
Lindsay and I celebrated this past Earth Day by finally getting some backyard chickens in Haiku, HI. What could be better than combining composting, egg production, useless yardspace, and companionship!
In typical Maui fashion, our hens are a bit strange, being rescue hens of unknown provenance from Maui's 'boo boo zoo'. This private rehab center for domestic animals is one of many controversial private animal sanctuaries, notorious for a strict no-kill policy. Broken limbs, blindness, illness: no euthanasia whatsoever.
While such a policy is perhaps cruel, the greater tragedy is how large amounts of private money (often from one or two rich donors) flow into so many dubious domestic animal sanctuaries, while programs targeted at endangered Hawaiian species go underfunded. Hawaii has lost the majority of its endemic species and more are on the way out. How great it would be if one or two rich Americans made it their 'pet cause' to fund the conservation efforts aimed at the Maui parrotbill (only ~500 left) rather than rescuing species that are invasive and at zero risk of extinction.
Despite good intentions, such suboptimal outcomes highlight the fallacy of the Libertarian (and American?) narrative that we should facilitate rich people having more money through low taxation and let them reinvest the money into the economy as they see fit, rather than tax and use the resources for needed projects. The extremely rich just funnel scarce resources into crazy pet projects of dubious value.
BTW, volunteering / donating to the Maui Parrotbill project is very fun, even more fun than chickens, and extremely valuable. Check out www.mfbrp.org
Tuesday, 19 February 2013
And because I love ordination diagrams, below is a closer look at the Blickley analysis: quantitative/technical skills and the management/interpersonal fields seem to be on opposing ends of the 1st Principal Component, suggesting that grad students may have to decide early whether to bet on succeeding as a technical person / senior scientist, or as a manager type. The study clearly states what it thinks is a winning strategy: “. . . there are a lot of things you can learn, but [interpersonal skills are] the hardest to teach.”
I think the quantitative side of my brain has cannibalized the portion devoted to interpersonal skills, so I have no advice to give on this matter. But, as for the corresponding need for GIS savvy, here are my two cents:
2) if you are serious about working with large ecological data or serious about taking up GIS, beware of classes/programmes that are little more than ESRI tutorials: you will be set up with a platform of limitation and disappointment. Even at the highest echelons of ArcMastery (and expensive licenses), you'll inevitably end up having to tell your superiors that you couldn't complete such-and-such a task because 'ArcGIS doesn't do that.' (But hey, that's a good looking map!)
Getting really good at ArcGIS is like becoming a master of Macromedia right before Flash came out: they jump from Avenue, to VB, to Python, to .... what's next? Instead, if you use R for GIS, there is always a way to do what you want. It may be difficult, but mastering R for a difficult GIS task yields transferable skills in a host of disciplines. It used to be a huge pain, but recent libraries like 'rgeos' (mixed with 'rgdal' and 'raster') give users most of the cookie-cutter facilities familiar to ESRI users. And it's free and open-source (more on this later...)
I hope to have a little tutorial on GIS'ing in R. Until then, the already R-acquainted can leap into the subject with the following advice:
Getting started with GIS in R
1) for any questions, always start your Google/DuckDuckGo queries with 'R-sig-geo': the listserv archives are replete with questions and answers to the issues you will inevitably have (and far better than ESRI documentation).
2) get acquainted with the internal data structures of gridded data and vector data in the 'sp' package, e.g.,
> ?SpatialGridDataFrame... to the point of being able to reconstruct the structures from scratch. HINT: they are S4 objects whose slots hold further nested objects, which hold further objects...
3) learn about the 'Proj.4' syntax for defining projections/coordinate systems. There is a much larger context to this topic, but as a starting point one can just bookmark the more usual coordinate systems and projections: for example, WGS84 is "+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs", and UTM zone 9N (e.g., for British Columbia, Canada) could be "+proj=utm +zone=9 +ellps=GRS80 +units=m +no_defs".
4) learn the examples in the help files of the following core GIS libraries:
rgeos: vector data basic operations, like unions, buffers, spatial sampling, etc.
rgdal: GDAL library to read and write a variety of raster datasets (see 'writeGDAL(...)') : GeoTIFFS, ESRI grids, floats, etc. It also provides the ability to reproject vector data (see 'spTransform').
maptools: basic GIS facilities, including 'readShapePoly(...)' for easy import of ESRI shapefiles.
raster: clip, shrink, reproject, resample, stack rasters -- a parallel (and better) way of representing gridded data (seemingly a rival to the SpatialGridDataFrame?). Despite the one-line-of-code annoyance of switching between SGDF class and raster class, this package takes the cake for handling of rasters. (For those of you taking note, you'll notice that, yes, there are TWO different libraries for projecting vector versus raster data).
5) learn about plotting maps with the spplot(...) function. An entire book could be written on spplot(), but start with col.regions=terrain.colors(100) for decent colours.
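As a rough starting point for steps 2 and 5, here is a minimal sketch that builds a small gridded object from scratch and plots it. It assumes only that the 'sp' package is installed; the grid dimensions, the column name elev, and the values are made up for illustration.

```r
library(sp)

# Grid geometry: centre of the lowest-coordinate cell, cell size,
# and number of cells in x and y (all values hypothetical)
topo <- GridTopology(cellcentre.offset = c(0.5, 0.5),
                     cellsize          = c(1, 1),
                     cells.dim         = c(10, 8))

# One row of attributes per grid cell (10 * 8 = 80 cells)
vals <- data.frame(elev = seq_len(10 * 8))

# Combine geometry and attributes; the projection is left undefined here
sgdf <- SpatialGridDataFrame(topo, vals)

# Peek at the nested structure: each slot holds another object
print(slotNames(sgdf))

# Plot with the colour ramp suggested in step 5
p <- spplot(sgdf, "elev", col.regions = terrain.colors(100))
print(p)
```

Exploring each slot in turn (e.g., sgdf@grid, sgdf@data) is a quick way to see how the pieces fit together before moving on to real rasters.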
There are a few extra steps to get the GIS libraries running in Linux, in particular, installing the system libraries upon which 'rgeos', 'maptools', and 'rgdal' depend. Even though the dependencies are documented in the respective package pages, I still found it a bit tricky. First, the libraries you want are often the 'dev' versions (e.g., libproj-dev), as explained in this post. I was generally successful in Ubuntu with:
> sudo apt-get install libproj-dev libgdal1-dev
Mercifully, there is a dedicated repository of GIS libraries for Ubuntu and Debian flavours. You can add the repository to your source list by entering the following in your terminal:
> sudo add-apt-repository ppa:ubuntugis/ubuntugis-unstable
1 Blickley, Jessica L., Kristy Deiner, Kelly Garbach, Iara Lacher, Mariah H. Meek, Lauren M. Porensky, Marit L. Wilkerson, Eric M. Winford, and Mark W. Schwartz. “Graduate Student’s Guide to Necessary Skills for Nonacademic Conservation Careers.” Conservation Biology 27, no. 1 (2013): 24–34. doi:10.1111/j.1523-1739.2012.01956.x.
Saturday, 16 February 2013
And so my mind turns to a cyber outsider that has me excited: DuckDuckGo. It is described as a hybrid search engine, pulling results from Yahoo, Wolfram Alpha, Wikipedia, and its own crawler. It is generating lots of buzz, perhaps being the only contender against Google. For me, its open-source, tweakable platform tickles my nagging sense of indignation with Google and its censor-happy tendencies. DuckDuckGo promises zero tracking, privacy, and decent searches. For me, it does well for R and science-related searches, which is important (although Google is still the best for pulling results from academia, Stack Overflow, relevant science sites, etc.).
Has it occurred to you that Google has turned to the dark side? For me, the realization came a long time ago, before SOPA and other aggressive attacks on internet freedom. Consider the story of the deceased Gigapedia.org and Library.nu way back in 2006: these websites hosted free (& pirated) college textbooks, and were blacklisted from the Google results. To add insult to injury, when Library.nu was taken down in early 2012, the domain name actually redirected you to Books.Google.Com (wow!): now, library.nu directs you to internationalpublishers.org, which is a great study of double-think.
For me, I'll give DuckDuckGo a good chance.
My cannot-do-without search tools: (I miss the duck).
Firefox InstaFox Add-on: to do all sorts of search engine searches from your browser address bar. Tailor it to use d+space for DuckDuckGo. I have 'ci' for CiteUlike.org, 'me' for Mendeley, etc.
GNOME Do: For Linux users, why point and click when Super+Space gives you access to web searches, applications, recent files, music, system settings, etc.?
Wednesday, 13 February 2013
The app is admittedly primitive: just a refresh button to get a new marine mammal article, plus some facilities for directed browsing. Enjoy! The real engine is the Wikipedia Random project.
Instructions:
> Download the .apk file here
> Move the .apk onto your android device, via bluetooth or usb or whatever. Move it anywhere in the android file structure, but make sure you can find it again within your Android file manager (I use ES File Manager).
> Detach from the computer and turn on your Android. Browse to the .apk file and tap it. Follow the instructions to install!
> To locate the Random Whale Wiki icon, you'll need to manually drag it onto your 'desktop' through Settings--> Apps.
Personally, I think Wikipedia is great for learning about science. Recent innovations like the organized Portals are downright exciting! I like the Biology Portal. Who would want the linear bore of an intro college textbook, when you can have the organic cluster of articles that allow you to click-through and venture out into the Wikispace as far as you are willing to go? Just be mindful of Wikipedia's biases and reliability.
Are you a fan of Wikipedia random-browsing? How do you use Web 2.0 resources for edutainment?