In many Big Data analysis blogs, at Big Data meetups and in the halls of the most recent O'Reilly Strata Conference, one of the most-discussed topics is which language is better for data analysis: Python (http://www.python.org/) or R (http://www.r-project.org/). Some of the talk has even taken on religious overtones, not unlike earlier debates over Windows (http://www.dice.com/job/results?caller=searchagain&j=true&q=Windows&src=19&x=all&p=/?icid=dicenews) vs. Linux (http://www.dice.com/job/results?caller=searchagain&j=true&q=Linux&src=19&x=all&p=/?icid=dicenews) or Microsoft's (http://www.dice.com/jobsearch/company/DiceId_microwa/Microsoft+Corporation/?icid=dicenews) Internet Explorer vs. Mozilla Firefox.
So what's the issue here? Why are Big Data analysts (http://www.dice.com/job/results?caller=searchagain&j=true&q=Big+Data+Analyst&src=19&x=all&p=/?icid=dicenews) so concerned with what language to use? In my honest opinion, the root of the issue probably has more to do with the tools they learned on than anything else. But let's briefly look at each.

Python

Python (http://www.dice.com/job/results?caller=searchagain&q=Python&src=19&x=all&p=/?icid=dicenews) is a general-purpose scripting language that can do many things, from complex data processing and data munging to implementing the mathematical and algorithmic functions used in machine learning. Many developers are comfortable with Python since it's easier to learn than R (http://news.dice.com/2011/01/28/where-are-all-the-r-jobs-places-like-amazon-and-resonant-networksto-start/). Because Python is a scripting language, it lets the data analyst play around with data sources and data parsing ad hoc, without committing to a formal programming model. With the help of additional libraries you can do text mining, vectorize the text data and identify similarities between posts and texts. Python also supports object-oriented programming, so you can build structured, modular applications should you choose to. This can be seen as an advantage over R (http://www.dice.com/job/results?caller=searchagain&q=R&src=19&x=all&p=/?icid=dicenews).
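To make the text-mining remark concrete, here is a minimal sketch of vectorizing a few short posts and scoring how similar they are. It uses only the standard library; the function names (`vectorize`, `cosine_similarity`), the bag-of-words representation and the sample posts are assumptions chosen for illustration, and a real project would more likely reach for a dedicated text-mining library.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Turn a text into a bag-of-words vector (word -> count)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical sample posts, made up for this sketch.
posts = [
    "R makes statistical modeling easy",
    "Python makes data munging easy",
    "Julia is built for speed",
]
vectors = [vectorize(post) for post in posts]

# Compare every pair of posts once.
for i in range(len(posts)):
    for j in range(i + 1, len(posts)):
        score = cosine_similarity(vectors[i], vectors[j])
        print(f"post {i} vs post {j}: {score:.2f}")
```

A score near 1 means two posts use much the same words, and 0 means they share none. Word counts are the simplest possible vectorization; weighting schemes such as TF-IDF are a common refinement.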
R

R is an extremely rich environment, especially when you get into statistics. Inference, statistical modeling and then plotting your data as a bar chart, pie chart or histogram are trivial in R, as it's built for statistical modeling using vectors and matrices. As R was created by statisticians for statisticians, someone who has a general knowledge of statistics usually finds it exceptionally easy to master. Programmers coming from other languages also seem to have an easy time learning and using it. If you're a data analyst who wants to see data distributions before drawing conclusions, R allows you to visualize outliers and data density. For probabilistic problems and distributions, and for linear regression problems, the ease with which R manipulates data as vectors and matrices makes life exceptionally simple. With R's statistics-rich library of algorithms, there's no need to understand the specifics of data types, as would be required with Python. It has a tremendous following and support, especially from the academic and commercial statistics communities, and now from the Big Data analytics community.

Python vs. R?

Should you use one over the other in Big Data analytics? I think that both are valuable and that you should look closely at the specific problem you're trying to solve. Both Python and R need to be in the data scientist's (http://www.dice.com/job/results?caller=searchagain&j=true&q=Data+Scientist&src=19&x=all&p=/?icid=dicenews) and data analyst's toolbox, and a skilled Big Data professional should be ready to use either, depending on the problem they're working on. A recent survey (http://www.kdnuggets.com/2013/12/poll-results-r-leading-python-gaining.html) of data scientists and data miners by KDnuggets found that R has a solid lead, and was used by about 77 percent of the voters. Python was used by about 32 percent of voters.
When it comes to pay (http://news.dice.com/2014/01/29/tech-professionals-salaries-confidence-rise-dice-report/), the data scientists and data analysts who had the highest salaries knew R, according to Dice (http://www.dice.com/jobsearch/company/DiceId_Diceinc/Dice+Holdings%2C+Inc/?icid=dicenews). Is R better than Python? For some things. From a systems performance standpoint, R and Python seem to perform very much the same.

An Alternative: Julia

What is Julia? It's a high-level, high-performance dynamic programming language for technical computing. It includes many of the mathematical and statistical libraries found in any high-performance environment. It's also very extensible: There's a built-in package manager for adding new external libraries and packages. Julia is built for speed. Applications using it rather than Python or R have been found to be ridiculously fast. Here are some comparisons from the Julia Language website (http://julialang.org/blog/2013/03/julia-tutorial-MIT/). The figures are benchmark times relative to C, so smaller is better:

application   fib      quicksort   mandel   pi_sum
Julia         0.91     1.14        0.85     1.0
Python        30.37    31.89       14.19    16.33
R             411.36   524.29      106.97   15.42
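For readers who want a feel for what micro-benchmarks such as fib and pi_sum measure, here is a rough Python sketch. The function bodies, the input sizes (`fib(20)`, 10,000 terms) and the `best_time` helper are assumptions made for illustration; this is not the Julia site's benchmark code, and the timings it prints will vary from machine to machine.

```python
import time


def fib(n):
    """Naive doubly recursive Fibonacci, a common test of function-call overhead."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


def pi_sum(terms=10000):
    """Partial sum of 1/k^2, a series that converges to pi^2 / 6."""
    total = 0.0
    for k in range(1, terms + 1):
        total += 1.0 / (k * k)
    return total


def best_time(func, *args, repeats=5):
    """Return the fastest of several wall-clock timings, in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        best = min(best, time.perf_counter() - start)
    return best


# Time each benchmark and report milliseconds.
for name, func, args in [("fib", fib, (20,)), ("pi_sum", pi_sum, ())]:
    print(f"{name}: {best_time(func, *args) * 1000:.3f} ms")
```

Taking the best of several runs is one simple way to reduce noise from other processes. Ratios like those in the table come from dividing such a timing by the time of an equivalent program in the reference language.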
How do programs written in Julia run so fast? Because of its LLVM-based just-in-time (JIT) compiler, which is designed for a high-performance environment. Julia is also designed for cloud computing and parallelism, as it provides a number of key building blocks for distributed computation. That makes it flexible enough to support a number of styles of parallelism, and allows users to add more. Julia is also a very easy language to learn, use and debug. If you have previously done any kind of programming in C# (http://www.dice.com/job/results?caller=searchagain&q=C%23&src=19&x=all&p=/?icid=dicenews), C (http://www.dice.com/job/results?caller=searchagain&q=C&src=19&x=all&p=/?icid=dicenews), C++ (http://www.dice.com/job/results?caller=searchagain&q=C%2B%2B&src=19&x=all&p=/?icid=dicenews), Java (http://www.dice.com/job/results?caller=searchagain&q=Java&src=19&x=all&p=/?icid=dicenews), Python, R, etc., learning Julia should be a cakewalk. A number of MIT video tutorials for learning Julia are located here (http://julialang.org/blog/2013/03/julia-tutorial-MIT/).

Conclusion

Will Julia replace Python or R? Not yet, since some libraries useful for Big Data analysis are just not available. However, with greater adoption, it could be the case within three years. After all, technology advances very rapidly, especially when it comes to Big Data. Would I recommend Julia for Big Data? Like Python and R, I think it should be a part of every data scientist's and data analyst's tool kit.
Copyright 1990 - 2014 Dice. All rights reserved.