Thursday, September 10, 2015

Apache Spark 1.5 Released

Apache Spark 1.5 has been released and is now available for download:



http://spark.apache.org/downloads.html



Major features include:

  • First actual implementation of Project Tungsten
  • Changes in code generation
  • Performance improvements from Project Tungsten
  • As noted in my previous post, Spark introduced a new SQL tab in the web UI for visually analyzing SQL and DataFrame queries.
  • Stability and other improvements in Spark Streaming, aimed at bringing batch processing and streaming closer together. Adds a Python API for streaming machine learning algorithms: K-Means, linear regression, and logistic regression.

  • Streaming storage is now shown in the web UI. Kafka offsets of direct Kafka streams are available through the Python API. Adds Python APIs for Kinesis, MQTT, and Flume.
  • More algorithms for machine learning and analytics. Adds Python APIs for distributed matrices, streaming k-means and linear models, LDA, power iteration clustering, etc.
  • The full list is in the release notes for Apache Spark.
    And now it's time to use it more and actually use the Python API :-)
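
To give a feel for what the streaming k-means in the list above does, here is a minimal plain-Python sketch of one mini-batch update. It is not the MLlib API: the function name update_centers, its parameters, and the way decay is applied to the old weights are assumptions made for illustration only.

```python
def update_centers(centers, weights, batch, decay=1.0):
    """One streaming k-means step (illustrative sketch, not the MLlib code).

    centers: list of k points (each a list of floats)
    weights: list of k floats, how much data each center has seen so far
    batch:   list of new points arriving in this mini-batch
    decay:   1.0 keeps all history, 0.0 forgets it (assumed semantics)
    """
    k = len(centers)
    dim = len(centers[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k

    # Assign every new point to its nearest current center.
    for point in batch:
        nearest = min(
            range(k),
            key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centers[i])),
        )
        counts[nearest] += 1
        for d in range(dim):
            sums[nearest][d] += point[d]

    # Blend the discounted old center with the points assigned in this batch.
    new_centers, new_weights = [], []
    for i in range(k):
        old = weights[i] * decay
        total = old + counts[i]
        if total == 0:
            # Nothing old and nothing new: leave the center where it is.
            new_centers.append(list(centers[i]))
            new_weights.append(0.0)
            continue
        new_centers.append(
            [(centers[i][d] * old + sums[i][d]) / total for d in range(dim)]
        )
        new_weights.append(total)
    return new_centers, new_weights
```

Calling update_centers once per incoming batch and feeding the returned centers and weights into the next call mimics how the model keeps adapting as data streams in. In Spark itself you would use the streaming k-means class from the Python API rather than anything like this.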

Sunday, September 6, 2015

Apache Spark 1.5, interesting new SQL tab in UI


While exploring the developer version of Apache Spark 1.5 and checking out its new features, I found an interesting new tab named SQL :-).



Details are in the following post:

https://www.linkedin.com/pulse/apache-spark-15-interesting-new-sql-tab-ui-abhishek-choudhary