Apache Spark 1.5 has been released and is now available for download:
http://spark.apache.org/downloads.html
Major features included:
- First real implementation of Project Tungsten
- Changes to code generation
- Performance improvements from Project Tungsten
- As mentioned in my previous post, Spark introduced a new visualization for analyzing SQL and DataFrame queries.
- Improvements and stability fixes in Spark Streaming, aimed at bringing batch processing and streaming closer together. Python API for streaming machine learning algorithms: K-Means, linear regression, and logistic regression.
- Streaming storage is now included in the web UI. Kafka offsets of direct Kafka streams are available through the Python API. Added Python APIs for Kinesis, MQTT, and Flume.
- More algorithms for machine learning and analytics. Added more Python APIs for distributed matrices, streaming k-means and linear models, LDA, power iteration clustering, etc.
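
To give a feel for what streaming k-means does with each incoming batch, here is a minimal sketch in plain Python. It is not Spark code and does not use the new Python API; it only illustrates a simplified version of the decayed center update that streaming k-means is built on. The function names (`nearest`, `update_centers`) and the `decay` parameter are my own choices for illustration, not names from Spark.

```python
# Illustrative sketch only: a simplified streaming k-means update in plain Python.
# Function and parameter names are hypothetical, not part of Spark's API.

def nearest(centers, point):
    """Return the index of the center closest to point (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda i: sum((c - p) ** 2 for c, p in zip(centers[i], point)),
    )


def update_centers(centers, weights, batch, decay=1.0):
    """Fold one mini-batch into the current centers.

    A center c with weight n that receives m points with mean x becomes
    (c * n * decay + x * m) / (n * decay + m).
    decay=1.0 remembers all past data; decay=0.0 uses only the latest batch.
    """
    dim = len(centers[0])
    sums = [[0.0] * dim for _ in centers]   # per-center sum of assigned points
    counts = [0] * len(centers)             # per-center number of assigned points

    for point in batch:
        k = nearest(centers, point)
        counts[k] += 1
        for d in range(dim):
            sums[k][d] += point[d]

    new_centers, new_weights = [], []
    for c, n, s, m in zip(centers, weights, sums, counts):
        discounted = n * decay
        if m == 0:
            # No points assigned in this batch: keep the center, discount its weight.
            new_centers.append(list(c))
            new_weights.append(discounted)
            continue
        total = discounted + m
        # s[d] equals mean * m, so this is the weighted average of old center and batch mean.
        new_centers.append([(c[d] * discounted + s[d]) / total for d in range(dim)])
        new_weights.append(total)
    return new_centers, new_weights


if __name__ == "__main__":
    centers = [[0.0, 0.0], [10.0, 10.0]]
    weights = [1.0, 1.0]
    batch = [[1.0, 1.0], [9.0, 9.0], [11.0, 11.0]]
    print(update_centers(centers, weights, batch, decay=1.0))
```

In Spark itself the same idea runs distributed over each batch of a stream, so treat this only as a way to reason about the decay setting before trying the real API.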
Find the release notes for Apache Spark -
And now it's time to use it more and actually use the Python API :-)
For more details, check https://www.linkedin.com/pulse/apache-spark-15-released-abhishek-choudhary