Neal Rauhauser: DataScienceCentral, LinkedIn, Wrong Tool

Neal Rauhauser

DataScienceCentral Users – No Klout Via LinkedIn

The retrieval of 11,712 DataScienceCentral profiles completed earlier today. I found about 2,600 instances of ‘linkedin’ when I checked the extracted dt/dd files. I rigged up a bunch of awk statements and distilled that down to about 900 DSC userids and the associated plain text LinkedIn userids. I skipped the ones with embedded slashes, figuring I could go back later, and then I hit the roadblock you see above.
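The awk statements themselves are not shown, so here is a rough Python sketch of the same kind of distillation rather than the actual commands. It assumes each profile is available as raw text keyed by DSC userid and that LinkedIn links take the form `linkedin.com/in/<userid>`. The function name, the input layout, and the pattern are all hypothetical.

```python
import re

# Assumed link shape: linkedin.com/in/<userid>. The capture deliberately
# includes slashes so that ids with embedded slashes can be detected.
LINKEDIN_RE = re.compile(r"linkedin\.com/in/([\w%/-]+)", re.IGNORECASE)


def extract_linkedin_ids(profiles):
    """Return {dsc_userid: linkedin_userid} for plain-text LinkedIn ids.

    `profiles` maps a DSC userid to raw profile text. This layout is an
    assumption for illustration, not the format of the real extract files.
    """
    result = {}
    for dsc_id, text in profiles.items():
        match = LINKEDIN_RE.search(text)
        if match is None:
            continue  # no LinkedIn link in this profile
        linkedin_id = match.group(1).rstrip("/")
        if "/" in linkedin_id:
            continue  # embedded slash: skip for now, revisit later
        result[dsc_id] = linkedin_id
    return result
```

Fed a profile containing `linkedin.com/in/alicesmith`, this keeps `alicesmith`, while a link such as `linkedin.com/in/bob/en/extra` is set aside because of the embedded slashes.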

Choosing The Wrong Tool

When we left off last night we had about 55,000 keywords from about 11,500 Data Science Central profiles. I manually reviewed and filtered them, tossing out whatever I felt was noise. Here are those nodes after I left the ForceAtlas2 layout algorithm running while I was making breakfast.

Automated community detection in Gephi found eleven communities using the default values. If you’re a puzzled Maltego user trying to follow along, community detection is an algorithm that examines edges and assigns nodes to groups, which can then be colored. This is what you are doing manually with the five colored stars Maltego lets you use to group entities.
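To make "examines edges and assigns nodes to groups" concrete, here is a minimal sketch of one simple community detection approach, label propagation, in plain Python. This is not the modularity-based method Gephi runs, and the function names and deterministic tie-breaking rule are assumptions chosen to keep the example small and repeatable.

```python
from collections import Counter


def label_propagation(edges, max_rounds=100):
    """Give each node the label most common among its neighbours.

    `edges` is an iterable of (node, node) pairs. Nodes must be mutually
    comparable (for example, all strings) because they are visited in
    sorted order and ties are broken by taking the largest label.
    """
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    # Every node starts out in its own community.
    labels = {node: node for node in neighbours}
    for _ in range(max_rounds):
        changed = False
        for node in sorted(neighbours):
            counts = Counter(labels[n] for n in neighbours[node])
            top = max(counts.values())
            best = [label for label, count in counts.items() if count == top]
            if labels[node] in best:
                continue  # current label is already a majority choice
            labels[node] = max(best)  # deterministic tie-break (assumption)
            changed = True
        if not changed:
            break  # labels are stable
    return labels


def group_by_label(labels):
    """Collect nodes that share a label into sorted lists."""
    groups = {}
    for node, label in labels.items():
        groups.setdefault(label, set()).add(node)
    return sorted(sorted(group) for group in groups.values())
```

On two triangles joined by a single edge, `a-b-c` and `d-e-f` linked through `c-d`, this settles into two groups, one per triangle. Each group is what you would then paint with its own color.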

