64-Bit JMP for Linux

64-bit Linux represents a milestone in JMP statistical computing history.

Design of Experiments Revolutionized the Science of Statistics

One particular analytical strength of JMP sets it clearly apart from all other stats packages, open source or otherwise: its capabilities for design of experiments (DOE). Experimental design goes beyond the traditional statistical concept of learning from data that has already been collected. It is about planning how best to learn more by designing and running experiments.

Suppose you make tires by cooking various ingredients into rubber, combining that rubber with steel and other materials, then forming the rubber into tire-shaped molds of various dimensions and tread patterns. You might judge the success of those tires by measuring their traction in a variety of driving conditions, the range of operating temperatures and speeds they can endure, and the length of their usable lifetimes. You would be attempting to solve an impossibly complex problem. You would be trying to optimize at least four response measurements of varying importance and interactions that depend on an infinitely variable mixture of ingredients and countless combinations of cooking temperatures and times, molding pressures and times, tire dimensions and tread patterns. If you tried to optimize this manufacturing process using traditional methods, trying out every imaginable combination, you would be working for centuries and expending more natural resources than you could hope to collect.

Design of experiments offers a better way. By running a representative set of experiments that span the space of possibilities and using statistical modeling methods to interpolate and extrapolate those results, researchers can reduce the size of problems to something manageable.
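To make the scale of the problem concrete, here is a minimal C sketch of the arithmetic. It assumes every factor is tested at only two levels, which is a simplification and not a description of how JMP builds designs. The function names `full_factorial_runs` and `min_main_effects_runs` are hypothetical, chosen for illustration.

```c
#include <stdint.h>

/* Runs needed to try every combination of `factors` two-level factors:
   2^factors. Returns 0 as a sentinel when the count does not fit in
   64 bits. */
uint64_t full_factorial_runs(unsigned factors)
{
    if (factors >= 64) {
        return 0; /* 2^64 and above overflow uint64_t */
    }
    return (uint64_t)1 << factors;
}

/* Smallest number of runs that can estimate an intercept plus one main
   effect per factor, ignoring interactions: factors + 1. */
uint64_t min_main_effects_runs(unsigned factors)
{
    return (uint64_t)factors + 1;
}
```

With 20 two-level factors, `full_factorial_runs(20)` gives 1,048,576 runs, while `min_main_effects_runs(20)` gives 21. Real designs fall somewhere between these extremes, depending on which interactions the model must estimate.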

The Next Level of Design of Experiments

The problem is that analysts always had to leaf through books searching for a design model that resembled the problem they were trying to solve. In practice, they would have to use a model that was sort of like their problem, that sort of handled their conditions and that sort of modeled the behaviors of their system. To make matters worse, most of these “canned designs” required huge numbers of runs. If a run costs a few cents and takes minutes, that's no problem, but if your experiment involves building or changing a multimillion-dollar semiconductor fabrication complex or shutting down a thousand-unit-per-minute assembly line, you cannot hope to do all the runs it would take to get meaningful results.

JMP takes DOE to a whole new level by providing unique, powerful custom design capabilities. Researchers can describe their problem precisely and fully, and JMP can determine smaller numbers of runs that will be sufficient. JMP's unique graphical factor profilers enable researchers to explore the result space over any combination of responses and factors and ultimately maximize desirabilities for their entire system in seconds. JMP's DOE has allowed blue-chip customers to discover million-dollar annual savings or profit opportunities in mere weeks. As Bradley Jones, Senior Manager of Statistical Development and chief architect of JMP's DOE capability, says, “Of all the statistical methods invented in the last 100 years, design of experiments is the most cost-beneficial.”

64-Bit Linux Powers JMP's Latest Innovations

But JMP's unique strengths, such as design of experiments and its newly introduced restricted maximum likelihood estimation of general linear models, are computationally intensive in the extreme. With JMP's new multithreaded architecture running on a 64-bit dual processor, experimental designs that once took several days to compute can now be calculated in minutes. And problems that were impossible only last year can now be handled in minutes with 64-bit Linux JMP.

Porting JMP to 64-Bit Addressing

“When porting a 32-bit application to 64 bits, there are certain pitfalls you are likely to encounter”, says Potter. He continues:

The complexity is compounded if you support not only Linux but Windows and Macintosh as well, as we do with JMP. The key thing to remember is that in the 64-bit Linux architecture, pointers and longs are both 64 bits wide, while an int remains 32 bits. This breaks any code that assumes a pointer can be stored in an int.
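The pitfall can be illustrated with a short C sketch. This is not JMP source code, and the function names are hypothetical. It models what happens to an address forced into a 32-bit slot and contrasts that with a round trip through `uintptr_t` from `<stdint.h>`, an integer type that, where the implementation provides it, is wide enough to hold any `void *`.

```c
#include <stdint.h>

/* What survives when a pointer-sized value is squeezed into a 32-bit
   slot: only the low 32 bits. */
uint32_t low_32_bits(uintptr_t address)
{
    return (uint32_t)(address & 0xFFFFFFFFu);
}

/* Returns 1 if storing `address` in a 32-bit slot would lose
   information, 0 if the value survives intact. */
int truncation_loses_bits(uintptr_t address)
{
    return (uintptr_t)low_32_bits(address) != address;
}

/* Safe alternative: round-trip a pointer through uintptr_t. The
   pointer that comes back compares equal to the original. */
void *round_trip(void *p)
{
    uintptr_t bits = (uintptr_t)p;
    return (void *)bits;
}
```

On a platform with 64-bit pointers, any address above 4GB makes `truncation_loses_bits` return 1, which is exactly the case that old 32-bit code never had to consider.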

If you have source code compiled on both Linux and Windows, you must be extra careful. Although it's perfectly legal to store a pointer in a long on 64-bit Linux, converting a pointer to a long won't work on 64-bit Windows, where a long is still only 32 bits wide. On Windows, 64-bit pointers have to be converted to long longs instead.
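The difference between the two platforms is commonly described in terms of data models: LP64 on 64-bit Linux and LLP64 on 64-bit Windows. The following C sketch, with hypothetical names, classifies a platform from the byte sizes of `int`, `long` and a pointer, and checks whether a `long` is wide enough to hold a pointer. It is an illustration of the rule stated above, not a complete portability test.

```c
#include <stddef.h>

enum data_model { MODEL_ILP32, MODEL_LP64, MODEL_LLP64, MODEL_OTHER };

/* Classify a data model from the sizes, in bytes, of int, long and a
   pointer. */
enum data_model classify_data_model(size_t int_size, size_t long_size,
                                    size_t ptr_size)
{
    if (int_size == 4 && long_size == 4 && ptr_size == 4) {
        return MODEL_ILP32;  /* typical 32-bit platforms */
    }
    if (int_size == 4 && long_size == 8 && ptr_size == 8) {
        return MODEL_LP64;   /* 64-bit Linux */
    }
    if (int_size == 4 && long_size == 4 && ptr_size == 8) {
        return MODEL_LLP64;  /* 64-bit Windows */
    }
    return MODEL_OTHER;
}

/* Data model of the platform this file is compiled for. */
enum data_model this_platform_data_model(void)
{
    return classify_data_model(sizeof(int), sizeof(long), sizeof(void *));
}

/* Returns 1 if a long of `long_size` bytes can hold a pointer of
   `ptr_size` bytes, 0 otherwise. */
int long_holds_pointer(size_t long_size, size_t ptr_size)
{
    return long_size >= ptr_size;
}
```

Under LP64, `long_holds_pointer(8, 8)` is 1. Under LLP64, `long_holds_pointer(4, 8)` is 0, which is why shared source code cannot rely on `long` for this purpose. Code that must build under both models can often use `intptr_t` or `uintptr_t` from `<stdint.h>` rather than choosing between `long` and `long long` per platform.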

Until now, floating-point operations on PowerPC for Macintosh have used a 64-bit long double, but that may change in the future. PowerPC-based Macs use the “LP64” model in 64-bit executables, meaning that longs and pointers are 64 bits wide, just as they are on Linux. Apple has not yet announced a 64-bit strategy for Intel-based Macs.

Finally, although graphical user interface APIs are available to 64-bit applications on Linux and Windows, the Macintosh GUIs are not. Of course, you can separate your application into a 32-bit GUI that communicates with a 64-bit kernel, as Wolfram Research did with 64-bit Mathematica.

Nelson observed that porting JMP to 64 bits went smoothly. “The only issue was locating the few places where our code made assumptions about 32-bit word and pointer sizes”, says Nelson. “There were no surprises from the tools used for the port, either. Much of the ease in porting is due to the long heritage on 64-bit UNIX platforms of the open-source tools we used.”

“The GNU Compiler Collection, gdb, Emacs and the rest of the toolchain performed as expected on the x86_64 architecture”, says Nelson.

______________________

Comments

How does this differ from an advertisement?

hjmangalam

While I do read the advertising in LJ and appreciate the support of those companies who support Linux, the presentation of this article seems odd. It's essentially an advertisement for a commercial product written as an article. Unlike previous such articles (such as one explaining the ATA over Ethernet protocol that was written by an employee of the only company that was shipping such a product), neither the product nor the protocol is open source or available for free.

The only reference to alternatives was the backhanded reference to the excellent (and open source) R language: http://www.r-project.org/ which also has an exceptionally well-developed bioinformatics arm, the Bioconductor project: http://www.bioconductor.org.

This is exactly the sort of unconditional and one-sided article that I expect NOT to find in LJ.

Hopefully, we'll be hearing about R and the Bioconductor project in at least the same depth as this product blurb?

Harry
