Tuesday, 19 February 2013

Advice for ecology grad students... (Science)

ECOLOG is the largest listserv devoted to ecology-based topics (and also one of the most bizarre listservs I've encountered). There has been a lively discussion recently about a publication by Blickley et al. (2013). They provided a thoughtful analysis of the skills needed by ecology/conservation employers, inferred from job postings on the web. The community's response has been hearty surprise at the study's emphasis on 'project management / interpersonal skills', which trumped technical skills. This conclusion seems consistent with a quick head-count around the NOAA office: most of my colleagues are administrators, managers, and coordinators (with impressive technical qualifications), while dedicated Quants are few and far between. In contrast, the community gave a resounding 'learn GIS' rejoinder to the study.

And because I love ordination diagrams, below is a closer look at the Blickley analysis: the quantitative/technical skills and the management/interpersonal fields seem to be on opposing ends of the 1st Principal Component, suggesting that grad students may have to decide early whether to bet on succeeding as a technical person / senior scientist, or as a manager type. The study clearly states what it thinks is a winning strategy: “. . . there are a lot of things you can learn, but [interpersonal skills are] the hardest to teach.”



I think the quantitative side of my brain has cannibalized the portion devoted to interpersonal skills, so I have no advice to give on this matter. But as for the community's call for GIS savviness, here are my two cents:
1) everything in ecology has a space-time context, and colleagues without basic GIS facilities are frustratingly difficult to work or communicate with.

2) if you are serious about working with large ecological data or serious about taking up GIS, beware of classes/programmes that are little more than ESRI tutorials: you will be set up with a platform of limitation and disappointment. Even at the highest echelons of ArcMastery (and expensive licenses), you'll inevitably end up having to tell your superiors that you couldn't complete such-and-such a task because 'ArcGIS doesn't do that.' (But hey, that's a good looking map!)

Getting really good at ArcGIS is like becoming a master of Macromedia right before Flash came out: its scripting language has jumped from Avenue, to VB, to Python, to ... what's next? Instead, if you use R for GIS, there is always a way to do what you want. It may be difficult, but mastering R for a difficult GIS task yields transferable skills in a host of disciplines. It used to be a huge pain, but recent libraries like 'rgeos' (mixed with 'rgdal' and 'raster') give users most of the cookie-cutter facilities familiar to ESRI users. And it's free and open-source (more on this later...)

I hope to have a little tutorial on GIS'ing in R. Until then, the already R-acquainted can leap into the subject with the following advice:

Getting started with GIS in R

1) for any questions, always start your Google/DuckDuckGo queries with 'R-sig-geo': the listserv archives are replete with questions and answers to the issues you will inevitably have (and far better than ESRI documentation).

2) get acquainted with the internal data structures of gridded data and vector data in the 'sp' package, e.g.,
> ?SpatialGridDataFrame
> ?SpatialPointsDataFrame
... to the point of being able to reconstruct the structures from scratch. HINT: they are lists of lists of lists of...

3) learn about the 'Proj.4' syntax for defining projections/coordinate systems. There is a much larger context to this topic, but as a starting point one can just bookmark the more usual coordinate systems and projections. For example, WGS84 is "+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs", and UTM zone 9N (e.g., for British Columbia, Canada) could be "+proj=utm +zone=9 +ellps=GRS80 +units=m +no_defs".

4) learn the examples in the help files of the following core GIS libraries:
rgeos: vector data basic operations, like unions, buffers, spatial sampling, etc.
rgdal: bindings to the GDAL library, to read and write a variety of raster datasets (see 'writeGDAL(...)'): GeoTIFFs, ESRI grids, floats, etc. It also provides the ability to reproject vector data (see 'spTransform').
maptools: basic GIS facilities, including 'readShapePoly(...)' for easy import of ESRI shapefiles.
raster: clip, shrink, reproject, resample, stack rasters -- a parallel (and better) way of representing gridded data (seemingly a rival to the SpatialGridDataFrame?). Despite the one-line-of-code annoyance of switching between SGDF class and raster class, this package takes the cake for handling of rasters. (For those of you taking note, you'll notice that, yes, there are TWO different libraries for projecting vector versus raster data).

5) learn about plotting maps with the spplot(...) function. An entire book could be written on spplot(), but start with col.regions=terrain.colors(100) for decent colours.
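
Steps 2, 3 and 5 above can be sketched together in a few lines. This is only a minimal illustration, assuming the 'sp' package is installed; the coordinates, attribute values, grid dimensions and object names (proj4_bookmarks, pts, sgdf) are all made up for the example. The Proj.4 strings are simply stored as bookmarks here; attaching them to an object would go through CRS(), which may need a projection backend installed.

```r
library(sp)

# Step 3: "bookmarked" Proj.4 strings, copied from the text above.
# Either string can be wrapped in CRS() and passed as the `proj4string`
# argument of the constructors below.
proj4_bookmarks <- c(
  wgs84 = "+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs",
  utm9n = "+proj=utm +zone=9 +ellps=GRS80 +units=m +no_defs"
)

# Step 2a: vector data from scratch (coordinates and attributes are made up).
xy <- cbind(lon = c(-123.1, -123.4, -124.0),
            lat = c(49.3, 48.4, 49.7))
attrs <- data.frame(site = c("A", "B", "C"), depth_m = c(12, 40, 85))
pts <- SpatialPointsDataFrame(coords = xy, data = attrs)

# Step 2b: gridded data from scratch: a 4 x 3 grid of 1-unit cells.
topo <- GridTopology(cellcentre.offset = c(0.5, 0.5),
                     cellsize = c(1, 1),
                     cells.dim = c(4, 3))
sgdf <- SpatialGridDataFrame(topo, data = data.frame(elev = seq_len(12)))

# The objects are S4 classes; inspect their nested slots with str() or @.
slotNames(sgdf)
str(pts, max.level = 2)

# Step 5: spplot() returns a lattice (trellis) object; print() draws it.
p <- spplot(sgdf, "elev", col.regions = terrain.colors(100))
print(p)
```

Poking at the output of str() is a quick way to see how the nested pieces (coordinates, bounding box, attribute table) fit together before trying it on real data.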

Linux Users
There are a few extra steps to get the GIS libraries running in Linux, in particular, installing the system libraries upon which 'rgeos', 'maptools', and 'rgdal' depend. Even though the dependencies are documented on the respective package pages, I still found it a bit tricky. First, the libraries you want are often the 'dev' versions (e.g., libproj-dev), as explained in this post. I was generally successful in Ubuntu with:
> sudo apt-get install libproj-dev libgdal1-dev

Mercifully, there is a dedicated repository of GIS libraries for Ubuntu and Debian flavours. You can add the repository to your source list by entering the following in your terminal:
> sudo add-apt-repository ppa:ubuntugis/ubuntugis-unstable
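
Once the system libraries are in place, a quick check from within R shows which of the GIS packages can actually be loaded. This is just a sketch using base R; the variable names (gis_pkgs, available, missing_pkgs) are made up for illustration, and the package list is simply the one discussed in this post.

```r
# Packages named in this post; adjust the list to taste.
gis_pkgs <- c("sp", "rgeos", "rgdal", "maptools", "raster")

# requireNamespace() returns TRUE if the package can be loaded, FALSE otherwise.
available <- vapply(gis_pkgs, requireNamespace, logical(1), quietly = TRUE)
print(available)

# Report anything that still needs installing.
missing_pkgs <- gis_pkgs[!available]
if (length(missing_pkgs) > 0) {
  message("Still missing: ", paste(missing_pkgs, collapse = ", "),
          " (try install.packages() once the system libraries are in place)")
}
```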


1 Blickley, Jessica L., Kristy Deiner, Kelly Garbach, Iara Lacher, Mariah H. Meek, Lauren M. Porensky, Marit L. Wilkerson, Eric M. Winford, and Mark W. Schwartz. “Graduate Student’s Guide to Necessary Skills for Nonacademic Conservation Careers.” Conservation Biology 27, no. 1 (2013): 24–34. doi:10.1111/j.1523-1739.2012.01956.x.

Saturday, 16 February 2013

DuckDuckGone

This Valentine's Day I'm saddened by the surprise demise of a duck--likely taken by a fox on one of these ill-weather days while I was in Canada. I live on a small hobby farm outside Washington DC, in Boyds, where simple chickens, goats, horses and a duck have been companions that ground my psyche in the here and now, away from the abstract coding at NOAA. The duck never quite fit in among the other animals, having lost his conspecifics to a fox raid early in the year. I liked him the best: he'd actually wake me up in the mornings begging for bananas and oats.

And so my mind turns to a cyber outsider that has me excited: DuckDuckGo. It is described as a hybrid search engine, pulling results from Yahoo, Wolfram Alpha, Wikipedia, and its own crawler. It is generating lots of buzz, perhaps being the only contender against Google. For me, its open-source, tweakable platform tickles my nagging sense of indignation with Google and its censor-happy tendencies. DuckDuckGo promises zero tracking, privacy, and decent searches. For me, it does well for R and science-related searches, which is important (although Google is still the best for pulling results from academia, Stack Overflow, relevant science sites, etc.).

Has it occurred to you that Google has turned to the dark side? For me, the realization came a long time ago, before SOPA and other aggressive attacks on internet freedom. Consider the story of the deceased Gigapedia.org and Library.nu way back in 2006: these websites hosted free (& pirated) college textbooks, and were blacklisted from the Google results. To add insult to injury, when Library.nu was taken down in early 2012, the domain name actually redirected you to Books.Google.Com (wow!): now, library.nu directs you to internationalpublishers.org, which is a great study of double-think.
For me, I'll give DuckDuckGo a good chance.
My cannot-do-without Search tools:

Firefox InstaFox Add-on: to do all sorts of search engine searches from your browser address bar. Tailor it to use d+space for DuckDuckGo. I have 'ci' for CiteULike.org, 'me' for Mendeley, etc.

GNOME Do: For Linux users, why point and click when Super+Space gives you access to web searches, applications, recent files, music, system settings, etc.?
(I miss the duck).

Wednesday, 13 February 2013

The Random Whale Wiki App - Android

Ever want to just stumble upon random Cetacean information from Wikipedia? Here is an App for that! I made it as a fun way to lazily learn about marine mammals on my tablet while in bed: just lie back and learn about the fascinating behaviour, evolution, phylogeny and all the mangled bytes on Wikipedia's Cetacean portal.

Instructions:
> Download the .apk file here
> Move the .apk onto your android device, via bluetooth or usb or whatever. Move it anywhere in the android file structure, but make sure you can find it again within your Android file manager (I use ES File Manager).
> Detach from the computer and turn on your Android. Browse to the .apk file and tap it. Follow the instructions to install!
> To locate the Random Whale Wiki icon, you'll need to manually drag it onto your 'desktop' through Settings--> Apps.
The app is admittedly primitive: just a refresh button to get a new marine mammal article, plus some facilities for directed browsing. Enjoy! The real engine is the Wikipedia Random project.

Personally, I think Wikipedia is great for learning about science. Recent innovations like the organized Portals are downright exciting! I like the Biology Portal. Who would want the linear bore of an intro college textbook, when you can have the organic cluster of articles that allow you to click-through and venture out into the Wikispace as far as you are willing to go?  Just be mindful of Wikipedia's biases and reliability.

Are you a fan of Wikipedia random-browsing? How do you use Web 2.0 resources for edutainment?