Spark Release 1.2.0
Spark 1.2.0 is the third release on the 1.X line. This release brings performance and usability improvements in Spark’s core engine, a major new API for MLlib, expanded ML support in Python, a full H/A mode in Spark Streaming, and much more. GraphX has seen major performance and API improvements and graduates from an alpha component. Spark 1.2 represents the work of 172 contributors from more than 60 institutions in more than 1000 individual patches.
To download Spark 1.2 visit the downloads page.
Spark Core
In 1.2, Spark core upgrades two major subsystems to improve the performance and stability of very large scale shuffles. The first is Spark’s communication manager used during bulk transfers, which upgrades to a netty-based implementation. The second is Spark’s shuffle mechanism, which upgrades to the “sort based” shuffle initially released in Spark 1.1. Spark also adds an elastic scaling mechanism designed to improve cluster utilization during long running ETL-style jobs. This is currently supported on YARN and will make its way to other cluster managers in future versions. Finally, Spark 1.2 adds support for Scala 2.11. For instructions on building for Scala 2.11 see the build documentation.
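To make the "sort based" idea concrete, here is a toy model in plain Python. It is only a sketch under simplifying assumptions: it is not Spark's implementation, and the names `hash_partition`, `sort_based_map_output`, and `read_partition` are hypothetical. It shows the general technique of sorting one map task's records by target partition and keeping a single buffer plus an offset index, rather than one output per reducer.

```python
def hash_partition(key, num_partitions):
    # Toy partitioner for small non-negative integer keys (an assumption).
    return key % num_partitions


def sort_based_map_output(records, num_partitions):
    """Return (data, offsets) for one map task.

    data is a single buffer ordered by partition id;
    offsets[p]:offsets[p + 1] is the slice that belongs to partition p.
    """
    tagged = sorted(
        ((hash_partition(k, num_partitions), k, v) for k, v in records),
        key=lambda t: t[0],  # stable sort on partition id only
    )
    data = [(k, v) for _, k, v in tagged]

    # Count records per partition, then turn counts into prefix sums.
    offsets = [0] * (num_partitions + 1)
    for p, _, _ in tagged:
        offsets[p + 1] += 1
    for p in range(num_partitions):
        offsets[p + 1] += offsets[p]
    return data, offsets


def read_partition(data, offsets, p):
    # A reducer fetches only its own slice of the single buffer.
    return data[offsets[p]:offsets[p + 1]]


records = [(5, "e"), (2, "b"), (4, "d"), (1, "a"), (3, "c")]
data, offsets = sort_based_map_output(records, num_partitions=2)
print(offsets)
print(read_partition(data, offsets, 0))
```

With many reducers, the appeal of this layout is that each map task produces one sequential output and a small index instead of a separate output per reducer.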
Spark Streaming
This release includes two major feature additions to Spark’s streaming library: a Python API and a write ahead log for full driver H/A. First, the Python API covers almost all the DStream transformations and output operations. Input sources based on text files and text over sockets are currently supported. Support for Kafka and Flume input streams in Python will be added in the next release. Second, Spark Streaming now features H/A driver support through a write ahead log (WAL). In Spark 1.1 and earlier, some buffered (received but not yet processed) data can be lost during driver restarts. To prevent this, Spark 1.2 adds an optional WAL, which buffers received data into a fault-tolerant file system (e.g. HDFS). See the streaming programming guide for more details.
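The write ahead log idea can be illustrated with a small, self-contained Python sketch. This is a conceptual toy, not Spark Streaming's implementation: `ToyWriteAheadLog` and `ToyReceiver` are hypothetical names, a local temporary file stands in for the fault-tolerant file system, and tracking of already-processed records and log cleanup is omitted.

```python
import json
import os
import tempfile


class ToyWriteAheadLog:
    """Append-only log: a record is made durable before it is buffered."""

    def __init__(self, path):
        self.path = path

    def append(self, record):
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")
            f.flush()
            os.fsync(f.fileno())  # force the bytes to stable storage

    def replay(self):
        if not os.path.exists(self.path):
            return []
        with open(self.path, encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]


class ToyReceiver:
    def __init__(self, wal):
        self.wal = wal
        # On (re)start, recover everything that was logged earlier.
        self.buffer = wal.replay()

    def receive(self, record):
        self.wal.append(record)     # durable first
        self.buffer.append(record)  # then buffered in memory


log_path = os.path.join(tempfile.mkdtemp(), "received.log")

first = ToyReceiver(ToyWriteAheadLog(log_path))
first.receive({"id": 1})
first.receive({"id": 2})
del first  # simulate a crash: the in-memory buffer is gone

restarted = ToyReceiver(ToyWriteAheadLog(log_path))
print(restarted.buffer)
```

The ordering inside `receive` is the point of the sketch: because the record reaches the log before the in-memory buffer, a restart can rebuild the buffer from the log.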
MLlib
Spark 1.2 previews a new set of machine learning APIs in a package called spark.ml that supports learning pipelines, where multiple algorithms are run in sequence with varying parameters. This type of pipeline is common in practical machine learning deployments. The new ML package uses Spark’s SchemaRDD to represent ML datasets, providing direct interoperability with Spark SQL. In addition to the new API, Spark 1.2 extends decision trees with two tree ensemble methods: random forests and gradient-boosted trees, among the most successful tree-based models for classification and regression. Finally, MLlib’s Python implementation receives a major update in 1.2 to simplify the process of adding Python APIs, along with better Python API coverage.
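The learning-pipeline idea, stages run in sequence where some stages must be fitted before they can transform data, can be sketched in plain Python. This is a conceptual toy and deliberately not the spark.ml API: every class name below (`Tokenizer`, `KeywordClassifier`, `KeywordModel`, `Pipeline`, `PipelineModel`) is hypothetical, rows are plain dicts rather than a SchemaRDD, and the "classifier" is a trivial keyword matcher.

```python
class Tokenizer:
    """Transformer: adds a 'words' field to each row."""

    def transform(self, rows):
        return [dict(r, words=r["text"].lower().split()) for r in rows]


class KeywordModel:
    """Fitted model: predicts 1 if a row contains any learned keyword."""

    def __init__(self, keywords):
        self.keywords = keywords

    def transform(self, rows):
        return [
            dict(r, prediction=int(any(w in self.keywords for w in r["words"])))
            for r in rows
        ]


class KeywordClassifier:
    """Estimator: fit() learns keywords and returns a KeywordModel."""

    def __init__(self, min_count=1):
        self.min_count = min_count  # a tunable parameter of this stage

    def fit(self, rows):
        counts = {}
        for r in rows:
            if r["label"] == 1:
                for w in r["words"]:
                    counts[w] = counts.get(w, 0) + 1
        negatives = {w for r in rows if r["label"] == 0 for w in r["words"]}
        keywords = {w for w, c in counts.items() if c >= self.min_count}
        return KeywordModel(keywords - negatives)


class PipelineModel:
    def __init__(self, stages):
        self.stages = stages

    def transform(self, rows):
        for stage in self.stages:
            rows = stage.transform(rows)
        return rows


class Pipeline:
    def __init__(self, stages):
        self.stages = stages

    def fit(self, rows):
        fitted = []
        for stage in self.stages:
            if hasattr(stage, "fit"):
                stage = stage.fit(rows)  # estimator becomes a fitted model
            rows = stage.transform(rows)  # feed the output to the next stage
            fitted.append(stage)
        return PipelineModel(fitted)


training = [
    {"text": "Spark is fast", "label": 1},
    {"text": "Hadoop is slow", "label": 0},
]
model = Pipeline([Tokenizer(), KeywordClassifier(min_count=1)]).fit(training)
predictions = model.transform([{"text": "Spark rocks"}, {"text": "slow job"}])
print([p["prediction"] for p in predictions])
```

Re-running `Pipeline(...).fit` with a different `min_count` is the toy analogue of running the same sequence of algorithms with varying parameters.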
Spark SQL
In this release Spark SQL adds a new API for external data sources. This API supports mounting external data sources as temporary tables, with support for optimizations such as predicate pushdown. Spark’s Parquet and JSON bindings have been re-written to use this API and we expect a variety of community projects to integrate with other systems and formats during the 1.2 lifecycle.
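Predicate pushdown is easiest to see in a small model. The following Python sketch is a toy under stated assumptions, not the Spark SQL data sources API: `ToyRelation`, its `scan` method, and the `(column, operator, value)` filter tuples are all hypothetical. It contrasts a source that applies filters itself with an engine that reads every row and filters afterwards.

```python
import operator

# Comparison operators the toy source knows how to evaluate itself.
OPS = {"=": operator.eq, ">": operator.gt, "<": operator.lt}


class ToyRelation:
    """An external source that can apply simple filters while scanning."""

    def __init__(self, rows):
        self.rows = rows
        self.rows_returned = 0  # how many rows crossed into the engine

    def scan(self, columns, filters=()):
        for row in self.rows:
            if all(OPS[op](row[col], value) for col, op, value in filters):
                self.rows_returned += 1
                yield {c: row[c] for c in columns}


rows = [
    {"name": "a", "age": 15},
    {"name": "b", "age": 30},
    {"name": "c", "age": 45},
]

# With pushdown: the source drops non-matching rows before returning them.
pushed = ToyRelation(rows)
with_pushdown = list(pushed.scan(["name"], [("age", ">", 18)]))

# Without pushdown: the engine receives every row and filters afterwards.
naive = ToyRelation(rows)
everything = list(naive.scan(["name", "age"]))
without_pushdown = [{"name": r["name"]} for r in everything if r["age"] > 18]

print(with_pushdown == without_pushdown, pushed.rows_returned, naive.rows_returned)
```

Both paths produce the same answer; the difference is how many rows the source has to hand over to get there.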
Hive integration has been improved with support for the fixed-precision decimal type and Hive 0.13. Spark SQL also adds dynamically partitioned inserts, a popular Hive feature. An internal re-architecting around caching improves the performance and semantics of caching SchemaRDD instances and adds support for statistics-based partition pruning for cached data.
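Statistics-based pruning of cached data can be pictured as keeping a minimum and maximum per cached batch and skipping batches that cannot match a query. The Python sketch below is a simplified illustration, not Spark SQL's in-memory columnar implementation; `build_batches` and `scan_greater_than` are hypothetical helpers and the batch size of 4 is arbitrary.

```python
def build_batches(values, batch_size):
    """Split values into batches and record min/max statistics for each."""
    batches = []
    for start in range(0, len(values), batch_size):
        chunk = values[start:start + batch_size]
        batches.append({"min": min(chunk), "max": max(chunk), "values": chunk})
    return batches


def scan_greater_than(batches, threshold):
    """Return (matching values, number of batches actually scanned)."""
    scanned = 0
    result = []
    for batch in batches:
        if batch["max"] <= threshold:
            continue  # statistics prove that nothing in this batch matches
        scanned += 1
        result.extend(v for v in batch["values"] if v > threshold)
    return result, scanned


batches = build_batches(list(range(1, 13)), batch_size=4)
matches, scanned = scan_greater_than(batches, threshold=8)
print(matches, scanned)
```

Pruning of this kind pays off most when the cached data is roughly ordered on the filtered column, since the per-batch ranges then overlap little.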
GraphX
In 1.2, GraphX graduates from an alpha component and adds a stable API. This means applications written against GraphX are guaranteed to work with future Spark versions without code changes. A new core API, aggregateMessages, is introduced to replace the now deprecated mapReduceTriplets API. The new aggregateMessages API features a more imperative programming model and improves performance. Some early test users found 20% - 1X performance improvement by switching to the new API.
In addition, Spark now supports graph checkpointing and lineage truncation which are necessary to support large numbers of iterations in production jobs. Finally, a handful of performance improvements have been added for PageRank and graph loading.
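The imperative style of aggregateMessages mentioned above can be sketched in plain Python. This is a single-machine toy, not GraphX: the `EdgeContext` class, its `send_to_src`/`send_to_dst` methods, and the `aggregate_messages` function are hypothetical stand-ins that only mirror the general shape of the idea (a send function invoked per edge and a merge function that combines messages arriving at the same vertex).

```python
class EdgeContext:
    """Gives the send function access to one edge and to message delivery."""

    def __init__(self, src_id, dst_id, attr, inbox, merge_msg):
        self.src_id = src_id
        self.dst_id = dst_id
        self.attr = attr
        self._inbox = inbox
        self._merge = merge_msg

    def send_to_src(self, msg):
        self._deliver(self.src_id, msg)

    def send_to_dst(self, msg):
        self._deliver(self.dst_id, msg)

    def _deliver(self, vertex_id, msg):
        # Messages for the same vertex are combined as soon as they arrive.
        if vertex_id in self._inbox:
            self._inbox[vertex_id] = self._merge(self._inbox[vertex_id], msg)
        else:
            self._inbox[vertex_id] = msg


def aggregate_messages(edges, send_msg, merge_msg):
    """Run send_msg on every edge; return {vertex id: merged message}."""
    inbox = {}
    for src_id, dst_id, attr in edges:
        send_msg(EdgeContext(src_id, dst_id, attr, inbox, merge_msg))
    return inbox


edges = [(1, 2, "follows"), (1, 3, "follows"), (2, 3, "follows")]

# In-degree: every edge sends 1 to its destination; messages are summed.
in_degree = aggregate_messages(
    edges,
    send_msg=lambda ctx: ctx.send_to_dst(1),
    merge_msg=lambda a, b: a + b,
)
print(in_degree)
```

Vertices that receive no message (vertex 1 here) simply do not appear in the result.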
Other Notes
- PySpark’s sort operator now supports external spilling for large datasets.
- PySpark now supports broadcast variables larger than 2GB.
- Spark adds a job-level progress page in the Spark UI, a stable API for progress reporting, and dynamic updating of output metrics as jobs complete.
- Spark now has support for reading binary files for images and other binary formats.
Upgrading to Spark 1.2
Spark 1.2 is binary compatible with Spark 1.0 and 1.1, so no code changes are necessary. This excludes APIs marked explicitly as unstable. Spark changes default configuration in a handful of cases for improved performance. Users who want to preserve identical configurations to Spark 1.1 can roll back these changes.
- spark.shuffle.blockTransferService has been changed from nio to netty.
- spark.shuffle.manager has been changed from hash to sort.
- In PySpark, the default batch size has been changed to 0, which means the batch size is chosen based on the size of object. Pre-1.2 behavior can be restored using SparkContext([.. args.. ], batchSize=1024).
- Spark SQL has changed the following defaults:
  - spark.sql.parquet.cacheMetadata: false -> true
  - spark.sql.parquet.compression.codec: snappy -> gzip
  - spark.sql.hive.convertMetastoreParquet: false -> true
  - spark.sql.inMemoryColumnarStorage.compressed: false -> true
  - spark.sql.inMemoryColumnarStorage.batchSize: 1000 -> 10000
  - spark.sql.autoBroadcastJoinThreshold: 10000 -> 10485760 (10 MB)
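For users who want the Spark 1.1 behavior back, the pre-1.2 values listed above can be collected in one place. The Python sketch below only gathers those values and renders them as key=value pairs; the names `SPARK_1_1_DEFAULTS` and `as_properties` are hypothetical, and how the pairs are applied (configuration file, command line, or programmatically) is left open because it depends on the deployment. The PySpark batch size is not included because, per the list above, it is restored through the SparkContext constructor argument rather than a property.

```python
# Pre-1.2 defaults, taken from the list of changed settings above.
SPARK_1_1_DEFAULTS = {
    "spark.shuffle.blockTransferService": "nio",
    "spark.shuffle.manager": "hash",
    "spark.sql.parquet.cacheMetadata": "false",
    "spark.sql.parquet.compression.codec": "snappy",
    "spark.sql.hive.convertMetastoreParquet": "false",
    "spark.sql.inMemoryColumnarStorage.compressed": "false",
    "spark.sql.inMemoryColumnarStorage.batchSize": "1000",
    "spark.sql.autoBroadcastJoinThreshold": "10000",
}


def as_properties(conf):
    """Render a settings dict as sorted 'key=value' strings."""
    return [f"{key}={value}" for key, value in sorted(conf.items())]


for line in as_properties(SPARK_1_1_DEFAULTS):
    print(line)
```

Rolling back only a subset is equally valid, for example just the two shuffle settings if the SQL defaults are not a concern.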
Known Issues
A few smaller bugs did not make the release window. They will be fixed in Spark 1.2.1:
- Netty shuffle does not respect secured port configuration. Work around - revert to nio shuffle: SPARK-4837
- java.io.FileNotFound exceptions when creating EXTERNAL hive tables. Work around - set hive.stats.autogather = false. SPARK-4892.
- Exception in PySpark zip function on textfile inputs: SPARK-4841
- MetricsServlet not properly initialized: SPARK-4595
Credits
- Aaron Davidson – Improvements in Core; bug fixes in Core and Shuffle; improvement in Core and Shuffle
- Aaron Staple – Improvements in Core, MLlib, and Streaming; new features in PySpark; bug fixes in SQL
- Adam Pingel – Improvement in Core
- Ahir Reddy – Improvements in Core
- Akshat Aranya – Bug fixes in Core
- Alex Liu – Bug fixes in SQL
- Alexander Ulanov – New features in MLlib
- Allan Douglas R. De Oliveira – Improvements in Core
- Anand Avati – Improvement in Core
- Anant Asthana – Improvement in Core, MLlib, and SQL
- Andrew Ash – Documentation and bug fixes in Core
- Andrew Bullen – Bug fixes in MLlib
- Andrew Or – Improvements in Core and YARN; bug fixes in Windows, Core, and YARN; improvement in Core and YARN
- Andy Konwinski – Documentation in Core
- Aniket Bhatnagar – Bug fixes in Core and Streaming
- Ankur Dave – Improvements and bug fixes in GraphX
- Arun Ahuja – Documentation in Core
- Benoy Antony – Bug fixes in Web UI and YARN
- Bertrand Bossy – Bug fixes in Core
- Bill Bejeck – Bug fixes in Core
- Brenden Matthews – Bug fixes in Mesos
- Burak Yavuz – New features in MLlib
- Chao Chen – Improvements and documentation in Core
- Cheng Hao – Test, improvements, new features, bug fixes, and improvement in SQL
- Cheng Lian – Improvements in Core and SQL; test in SQL; new features in SQL; bug fixes in Core and SQL; documentation in Core
- Chester Chen – Bug fixes in YARN
- Chip Senkbeil – New features in Core
- Chirag Aggarwal – Bug fixes in SQL
- Chris Cope – Bug fixes in YARN
- Christoph Sawade – Improvements in MLlib and PySpark
- Cody Koeninger – Improvements in SQL
- Colin Patrick Mccabe – Improvements in Core
- DB Tsai – Improvements and improvement in MLlib
- Dale Richardson – Improvements in Core
- Dan McClary – New features in SQL
- Dan Osipov – New features in EC2
- Daoyuan Wang – Improvements in Core and SQL; new features in SQL; bug fixes in Core and SQL; documentation in Core
- Davies Liu – Improvements in Core, SQL, MLlib, and PySpark; new features in Core, Streaming, MLlib, and PySpark; bug fixes in Streaming, Core, SQL, MLlib, and PySpark; documentation in Core
- Derek Ma – Bug fixes in Core and Streaming
- DoingDone9 – Bug fixes in SQL
- Egor Pahomov – Bug fixes in Core
- Eric Eijkelenboom – Bug fixes in Core
- Eric Liang – Bug fixes in Core and SQL
- Erik Erlandson – Improvements and improvement in Core
- Eugen Cepoi – Improvements in Core
- Fairiz Azizi – Improvements in Core
- Felix Maximilian Moller – Documentation in Core
- Gankun Luo – Bug fixes in SQL
- Grega Kespret – Documentation in Core
- GuoQiang Li – Improvements in Core and MLlib; bug fixes in Core, Web UI, MLlib, and PySpark; improvement in YARN
- Hari Shreedharan – Bug fixes and improvement in Streaming
- Henry Cook – Documentation in Core
- Holden Karau – Documentation in Core; bug fixes in PySpark
- Hong Shen – Improvements in Core
- Hossein Falaki – Bug fixes in Web UI
- Ian Hummel – Improvements in Core
- Jacky Li – Bug fixes in Core
- Jakub Dubovsky – Bug fixes in Core
- Jascha Swisher – Bug fixes in Core
- Jay Vyas – Documentation in Core
- Jeremy Freeman – New features in Streaming and MLlib; bug fixes in Core and PySpark
- Jey Kottalam – Bug fixes in Core
- Jie Huang – Documentation and bug fixes in Core
- Jim Carroll – Improvements and bug fixes in SQL
- Jim Lim – Improvements in Core; bug fixes in YARN
- Jongyoul Lee – Bug fixes in Core and Mesos
- Joseph Bradley – Improvements in MLlib
- Joseph E. Gonzalez – Documentation in Core; bug fixes in GraphX and MLlib
- Joseph K. Bradley – Improvements in Core and MLlib; new features in MLlib and SQL; bug fixes in MLlib; documentation in Core and MLlib
- Josh Rosen – Improvements in Java API, Core, Web UI, and Shuffle; new features in Java API, Core, and Web UI; bug fixes in Core, PySpark, and Streaming; documentation in Core
- Kai Sasaki – Bug fixes in Core
- Kay Ousterhout – Improvements in Core and Web UI; bug fixes in Core and Web UI
- Ken Takagiwa – Documentation in Core
- Kenichi Maehashi – Improvements in Core
- Kevin Mader – Improvements in Java API and Core
- Kousuke Saruta – Improvements in Project Infra, Core, PySpark, YARN, SQL, and Web UI; bug fixes in Core, PySpark, MLlib, YARN, SQL, and Web UI; documentation in Core
- Larry Xiao – Improvements and bug fixes in GraphX
- Li Zhihui – Improvements in Core
- Liang-Chi Hsieh – Improvements in Core; bug fixes in Core and SQL
- Lianhui Wang – Bug fixes in GraphX
- Lijie Xu – Bug fixes in Core and GraphX
- Liquan Pei – Documentation in Core; new features in MLlib and PySpark
- Liu Hao – Bug fixes in Core
- Lu Lu – Improvements in GraphX
- Madhu Siddalingaiah – Documentation in Core
- Manish Amde – Improvements and new features in MLlib
- Marcelo Vanzin – Test in YARN; improvement in Core and YARN; new features in Core; bug fixes in Core and YARN; improvements in Core
- Mario Pastorelli – Documentation in Core
- Mark G. Whitney – Documentation in YARN
- Mark Hamstra – Bug fixes in Core
- Mark Mims – Improvements in Web UI
- Martin Weindel – Documentation in Core and Mesos
- Masayoshi TSUZUKI – Improvements in Windows, Core, and PySpark; bug fixes in Windows, Core, and PySpark
- Matei Zaharia – Improvement in Core and SQL; bug fixes in Core and SQL
- Matthew Cheah – Bug fixes in Core
- Matthew Farrellee – Improvements in Core; new features in PySpark; bug fixes in Core and PySpark
- Matthew Rocklin – Bug fixes in Core
- Matthew Taylor – Bug fixes in SQL
- Michael Armbrust – Improvements in SQL; new features in SQL; bug fixes in Core, PySpark, and SQL; documentation in Core
- Michael Griffiths – Bug fixes in PySpark
- Michelangelo D’Agostino – Improvements in MLlib and PySpark
- Mike Timper – Bug fixes in SQL
- Min Shen – Bug fixes in YARN
- Mingfei Shi – Bug fixes in Core
- Mubarak Seyed – Improvements in Streaming
- NamelessAnalyst – Improvements in GraphX
- Nan Zhu – Bug fixes and Improvements in Core
- Nathan Artz – Documentation in Core
- Nathan Howell – Bug fixes in SQL
- Nicholas Chammas – Improvement in Core; improvements in Project Infra, Core, and EC2; bug fixes in Project Infra, EC2, and SQL; documentation in Core
- Niklas Wilcke – Improvements in MLlib; bug fixes in Core
- Nishkam Ravi – Bug fixes in Core
- Oded Zimerman – Bug fixes in GraphX
- Patrick Wendell – Improvements in Core; bug fixes in Project Infra, Core, and Mesos
- Prashant Sharma – Improvements in Core; bug fixes in Streaming and Core; improvement in Core, YARN, and Streaming
- Praveen Seluka – New feature in Core
- Qiping Li – Improvements and new features in MLlib
- RJ Nowling – Improvements in MLlib; bug fixes in GraphX; documentation in Core
- Ravindra Pesala – Improvements, new features, and bug fixes in SQL
- Raymond Liu – Improvement in Core and Shuffle
- Renat Yusupov – Bug fixes in SQL
- Reno Zhang – Improvements in YARN
- Reynold Xin – Improvements in Core, Shuffle, EC2, and SQL; new features in Project Infra, Core, and EC2; bug fixes in Core and SQL; improvement in Core, Shuffle, and SQL
- Reza Zadeh – Improvements in Core; new features in MLlib; documentation in Core
- Rob O’Dwyer – Improvements in PySpark
- Rong Gu – Improvements in Core
- Rui Li – New features in Java API
- Saisai Shao – Improvements in Streaming; bug fixes in Streaming and Shuffle
- Sandy Ryza – Improvements in Core, MLlib, and YARN; new features in Core; bug fixes in Core and SQL
- Santiago M. Mola – Documentation in Core
- Sean Owen – Improvement in Streaming; improvements in Core and Streaming; new features in Core; bug fixes in Java API, Core, MLlib, and Streaming; documentation in Core
- Shane Knapp – Bug fixes in Core
- Shiti Saxena – Improvement in Core
- Shivaram Venkataraman – Improvements in Core; bug fixes in Core and EC2
- Shixiong Zhu – Test in Core; improvements in Core and Web UI; bug fixes in Core, Web UI, and YARN; documentation in Streaming and Core
- Bai Shou – Improvements and bug fixes in SQL
- Shuo Xiang – New features and bug fixes in MLlib
- Su Yan – Bug fixes in Core
- Sung Chung – Improvements in MLlib
- Surong Quan – Improvements in Streaming
- Takuya UESHIN – Test in SQL; documentation in Core; bug fixes in Core and SQL; improvements in SQL
- Tal Sliwowicz – Bug fixes in Core
- Tathagata Das – Improvements in Core and Streaming; bug fixes in Streaming and Core; improvement in Streaming
- Ted Yu – Bug fixes and improvement in Core
- Thomas Graves – Bug fixes in Core and YARN
- Tianshuo Deng – Bug fixes in Core and Shuffle
- Timothy Chen – Bug fixes in Mesos
- Tingjun Xu – Bug fixes in YARN
- Tomohiko K. – Bug fixes in Core and PySpark; improvement in PySpark
- Uncle Gen – Improvements in GraphX
- Uri Laserson – Improvements in PySpark
- Varadharajan Mukundan – Improvements in Core
- Venkata Ramana Gollamudi – New features and bug fixes in SQL
- Victor Tso – Bug fixes in Core
- Vida Ha – Improvements in SQL; bug fixes in EC2
- Viper Kun – Documentation in Core
- Wang Fei – Test in SQL; improvements in Core and SQL; bug fixes in Core and SQL; documentation in Core
- Wang Tao – Improvements in Core, YARN, and SQL; bug fixes in Core and YARN
- Ward Viaene – Bug fixes in PySpark
- Wenchen Fan – Bug fixes in SQL
- William Benton – Improvements and bug fixes in SQL
- Xiangrui Meng – Improvements in Core, PySpark, MLlib, SQL, Java API, and Web UI; documentation in Core; new features in SQL, MLlib, and PySpark; bug fixes in Core, MLlib, and PySpark; improvement in PySpark, MLlib, and SQL
- Xinyun Huang – Improvements in SQL
- Yadong Qi – Test in Core; improvements and bug fixes in Streaming
- Yanbo Liang – New features in MLlib
- Yantang Zhai – Improvements in Core; bug fixes in Core, Web UI, and SQL
- Yash Datta – Improvements in SQL
- Ye Xianjin – Improvements in Core
- Yin Huai – Documentation in Core; bug fixes in SQL
- Zdenek Farana – Bug fixes in SQL
- Zhan Zhang – Build fixes in SQL
- Zhang, Liye – Improvements and bug fixes in Core
Thanks to everyone who contributed!