Stephen E. Arnold: Open Source Big Data Tool Combines MapR with Elasticsearch

IO Tools
Stephen E. Arnold

MapR Integrates Elasticsearch into Platform

Writer Christopher Tozzi opens his Var Guy article, “MapR, Elasticsearch Partner on Open Source Big Data Search,” with a good question: With so many Hadoop distributions out there, what makes one stand out? MapR hopes that integrating Elasticsearch will be the answer for its own distribution. The move brings to MapR, as the companies put it, “a scalable, distributed architecture to quickly perform search and discovery across tremendous amounts of information.” The companies report that several high-profile clients are already using the integrated platform.

Tozzi concludes with an interesting observation:

“From the channel perspective, the most important part of this story is about the open source Hadoop Big Data world becoming an even more diverse ecosystem where solutions depend on collaboration between a variety of independent parties. Companies such as MapR have been repackaging the core Hadoop code and distributing it in value-added, enterprise-ready form for some time, but Elasticsearch integration into MapR is a sign that Hadoop distributions also need to incorporate other open source Big Data technologies, which they do not build themselves, to maximize usability for the enterprise.”

It will be interesting to see how that need plays out throughout the field. MapR is headquartered in San Jose, California, and was launched in 2009. Formed in 2012, Elasticsearch is based in Amsterdam. Both Hadoop-happy companies maintain offices around the world, and each proudly counts some hefty organizations among its customers.

Cynthia Murrell, May 07, 2014

Sponsored by ArnoldIT.com, developer of Augmentext

Worth a Look: Whitebox Geospatial Analysis Tools — Open Source and Cross-Platform

Geospatial, IO Tools

Whitebox Geospatial Analysis Tools (GAT) is an open-source and cross-platform geographic information system (GIS) and remote sensing software package that is distributed under the GNU General Public License. It has been developed by the members of the University of Guelph Centre for Hydrogeomatics and is intended for advanced geospatial analysis and data visualization in research and education settings. The package features a friendly graphical user interface (GUI) with help and documentation built into the dialog boxes for each of the more than 360 analysis tools. Users are also able to access extensive off-line and online help resources. The Whitebox GAT project started as a replacement for the Terrain Analysis System (TAS), a geospatial analysis software package written by John Lindsay. The current release supports raster and vector (Shapefile) data structures.

Whitebox GAT is extendible. Users are able to create and add custom tools or plugins using any JVM language. The software also allows scripting using the Groovy, Python and JavaScript programming languages.

Whitebox Home Page

Robin Good: Tool Kit for Learning How to Code

IO Tools
Robin Good

A Curated Guide About The Best Places Where To Learn How To Code: Bento

Bento is a website that, thanks to its author Jon Chan and the many user contributions, has gathered, organized and curated the very best resources available online where you can learn how to code. From HTML to JavaScript, Ruby, PHP, Java and Perl, Bento offers learning guidance for over 80 different technologies and coding languages.

Here is how Jon Chan, a 23-year-old who launched this project in September 2013, describes Bento:

“Bento is what I would have liked to have when I was learning to code. I started learning to code when I was very young – about ten years old. Then, the only things I had available were what I could find online and through a few dense books. Now, people have the exact opposite problem: how do you break through the noise and find what's actually valuable to learn? This site is here to help you figure that out.”

Bento is a perfect example of effective content curation: rather than simply collecting and listing every resource available for learning each language, it suggests only the very best ones, organizes them into easy, medium and hard levels, and also provides “best of” / direct solutions that save readers lots of valuable time. Free to use.

Useful, simple and immediate to use. Well organized. 9/10

Bento: http://www.bentobox.io/

More info: http://www.bentobox.io/about

Submit new links here: https://github.com/JonHMChan/bento/

Patrick Meier: Got TweetCred? Use This Tool To Automatically Identify Credible Tweets

Crowd-Sourcing, Geospatial, Innovation, IO Tools, P2P / Panarchy, Politics, Resilience
Patrick Meier

Got TweetCred? Use it To Automatically Identify Credible Tweets

What if there were a way to automatically identify credible tweets during major events like disasters? Sounds rather far-fetched, right? Think again.

The new field of Digital Information Forensics is increasingly making use of Big Data analytics and techniques from artificial intelligence like machine learning to automatically verify social media. This is how my QCRI colleague ChaTo et al. already predicted both credible and non-credible tweets generated after the Chile Earthquake (with an accuracy of 86%). Meanwhile, my colleagues Aditi et al. from IIIT Delhi also used machine learning to automatically rank the credibility of some 35 million tweets generated during a dozen major international events such as the UK Riots and the Libya Crisis. So we teamed up with Aditi et al. to turn those academic findings into TweetCred, a free app that identifies credible tweets automatically.

We’ve just launched the very first version of TweetCred—key word being first. This means that our new app is still experimental. On the plus side, since TweetCred is powered by machine learning, it will become increasingly accurate over time as more users make use of the app and “teach” it the difference between credible and non-credible tweets. Teaching TweetCred is as simple as a click of the mouse. Take the tweet below, for example.

Read full post with more links.

Robin Good: Flipped Up Twitter Feed Tool – Vellum Good Stuff

IO Tools
Robin Good

A Flipped-Up Twitter Feed with Only The Good Stuff In It: Vellum

If you find tracking news on Twitter a difficult task because of the sheer number of stories showing up, and because the context you need to judge the value and relevance of what is being shared is often missing, here is a new tool that may help you quiet down the visual noise and find what is really important more rapidly.

Vellum is a new free web app born out of a quick experiment at the New York Times R&D labs which allows you to see all of the most relevant Twitter stories coming from the people you follow, stripped of their commentary and showing their original title, description and source.

Vellum filters out text-only tweets that contain no links, eliminates duplicates and surfaces only those tweets that have already been retweeted by multiple people.

Vellum acts as a reading list for your Twitter feed, finding all the links that are being shared by those you follow on Twitter and displaying them each with their full titles and descriptions.

This flips the Twitter model, treating the links as primary and the commentary as secondary (you can still see all the tweets about each link, but they are less prominent).
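For readers curious how such a "links first" reading list could work, here is a minimal sketch in Python. It is not Vellum's actual code. It assumes tweets arrive as simple (author, text) pairs, and it approximates "retweeted by multiple people" as "shared by at least two distinct accounts". The names `LinkEntry`, `build_reading_list` and `min_sharers` are made up for illustration, and fetching each link's title and description is left out.

```python
import re
from dataclasses import dataclass, field

# Simple pattern for http(s) links; real tweets would need fuller URL handling.
URL_PATTERN = re.compile(r"https?://\S+")


@dataclass
class LinkEntry:
    """One shared link, with the tweets about it kept as secondary detail."""
    url: str
    sharers: set = field(default_factory=set)
    tweets: list = field(default_factory=list)


def build_reading_list(tweets, min_sharers=2):
    """Group (author, text) tweets by link and keep widely shared links.

    Hypothetical helper: text-only tweets are dropped, duplicate links are
    merged into one entry, and only links shared by at least `min_sharers`
    distinct authors are returned, most shared first.
    """
    entries = {}
    for author, text in tweets:
        for raw_url in URL_PATTERN.findall(text):
            # Trim punctuation that often trails a link in running text.
            url = raw_url.rstrip(".,;:!?)")
            entry = entries.setdefault(url, LinkEntry(url))
            entry.sharers.add(author)
            entry.tweets.append((author, text))
    popular = [e for e in entries.values() if len(e.sharers) >= min_sharers]
    return sorted(popular, key=lambda e: (-len(e.sharers), e.url))


if __name__ == "__main__":
    sample = [
        ("alice", "Great read http://example.com/a"),
        ("bob", "Wow http://example.com/a."),
        ("carol", "Just text, no link here"),
        ("dave", "Only I shared this http://example.com/b"),
    ]
    for entry in build_reading_list(sample):
        print(entry.url, "shared by", len(entry.sharers), "accounts")
```

The link is the key of each entry and the tweets hang off it, which mirrors the flipped model described above: the link is primary, the commentary is still available but secondary.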
