Ajay’s posterous

Happy 2010

HAPPY

NEW

2@1@


Harvard DropOut Writes Open Letter- His Startup has 350m users









Note from Mark “zucken”berg

An Open Letter from Facebook Founder Mark Zuckerberg
by Mark Zuckerberg Yesterday at 9:23pm

It has been a great year for making the world more open and connected. Thanks to your help, more than 350 million people around the world are using Facebook to share their lives online.

To make this possible, we have focused on giving you the tools you need to share and control your information. Starting with the very first version of Facebook five years ago, we’ve built tools that help you control what you share with which individuals and groups of people. Our work to improve privacy continues today.

Facebook’s current privacy model revolves around “networks” — communities for your school, your company or your region. This worked well when Facebook was mostly used by students, since it made sense that a student might want to share content with their fellow students.

Over time people also asked us to add networks for companies and regions as well. Today we even have networks for some entire countries, like India and China.

However, as Facebook has grown, some of these regional networks now have millions of members and we’ve concluded that this is no longer the best way for you to control your privacy. Almost 50 percent of all Facebook users are members of regional networks, so this is an important issue for us. If we can build a better system, then more than 100 million people will have even more control of their information.

The plan we’ve come up with is to remove regional networks completely and create a simpler model for privacy control where you can set content to be available to only your friends, friends of your friends, or everyone.

We’re adding something that many of you have asked for — the ability to control who sees each individual piece of content you create or upload. In addition, we’ll also be fulfilling a request made by many of you to make the privacy settings page simpler by combining some settings. If you want to read more about this, we began discussing this plan back in July.

Since this update will remove regional networks and create some new settings, in the next couple of weeks we’ll ask you to review and update your privacy settings. You’ll see a message that will explain the changes and take you to a page where you can update your settings. When you’re finished, we’ll show you a confirmation page so you can make sure you chose the right settings for you. As always, once you’re done you’ll still be able to change your settings whenever you want.

We’ve worked hard to build controls that we think will be better for you, but we also understand that everyone’s needs are different. We’ll suggest settings for you based on your current level of privacy, but the best way for you to find the right settings is to read through all your options and customize them for yourself. I encourage you to do this and consider who you’re sharing with online.

Thanks for being a part of making Facebook what it is today, and for helping to make the world more open and connected.

Well, at least he can write open code.


DECISION STATS - 9 new articles

Websites-
http://decisionstats.com
http://dudeofdata.com
http://prayers2go.com

LinkedIn- www.linkedin.com/in/ajayohri
Facebook- www.facebook.com/ajayohri
Twitter- www.twitter.com/dudeofdata

Quote for the Day-


---------- Forwarded message ----------
From: FeedBlitz <feedblitz@mail.feedblitz.com>
Date: Sun, Dec 6, 2009 at 12:33 AM
Subject: DECISION STATS - 9 new articles
To: ohri2007 <ohri2007@gmail.com>



Protected: Separating the Church and the State






This post is password protected.




     

Why I am dropping out of University






I am dropping out of the University of Tennessee

because

MY Professor Dr Frank Guess calls me Curious George by email

and Department Head Dr KENNETH GILBERT tells me not to write poetry on my blog OTHERWISE H ARM will come to

Mee.



     

Things that CHANGE in December 2009







Please click on images themselves for the direct source- All Rights Ack now ledge D unless they are owned by Ass O CIA te Press LI mited



     

Data Mining Presentation at M2009 by Dr Vincent Granville


Here is a Data Mining Presentation by Dr Vincent Granville at M 2009.

I could not see the presentation as I know him only through remote internet communication, but he did recommend me on LinkedIn and for my application for the University of Tennessee.

Here is the presentation on Google Docs:

http://docs.google.com/fileview?id=0B9YMMvghK2ytM2IyMzQzYjUtY2VlOC00ZmQ1LWJjZWItZjU2YjdmMTRlYTc4&hl=en

ps-

Hey Doc,
How’s Paris doing?
Ajay

pps-

The download is also available here.

Granville(2)

ppps- He is the same person for whom I worked last winter as a research ass is tant and got paid 28 cents per 1000 rows for all the statisticians in the world – read more about that here

http://decisionstats.wordpress.com/2009/11/30/weak-security-in-internet-databases-for-statisticians/



     

Christ Mass Tree in TENNESSEE


Here is the image of a Tree in Fall in Tennessee.

Please kill it so you can decorate your houses for Christmas and then release an ad on how environmentally friendly

you

your company

your university is.



     

My First YouTube Video: Courtesy the competition on VOLNIGHT by Univ of Tennessee


Here is a demonstration video created by BallRoom Mania, the social dance club of the University of Tennessee, whose President is Suzzane Devan.

Also starring J.T. Fuellen, UT grad and English major.

And starring me.

JT is the stereotype black hip hop guy/ HUH

Suzzane is the stereotype white blond who giggles



     

Best Internet Site of 2009


Here is the best internet site of 2009.
It basically shows how many jobs have been created per dollar spent.
Funded by the debt of American Treasuries
sold to Chinese.

Remember the Chinese Opium Wars.
Well the Chinese are hooked to American Treasuries and they probably need a Warship with Admiral to open their markets and currency. Oui!

Well anyway the website is called http://Recovery.gov




     

Ohri Principle- for Predictive Analytics for Events using TEx T mining


Interview Karl Rexer -Rexer Analytics — DecisionStats

Here is an interview with Karl Rexer of Rexer Analytics. His annual survey is considered a benchmark in the data mining and analytics industry. Here Karl talks of his career, his annual survey and his views on the industry direction and trends.


DecisionStats - Interview Paul van Eikeren Inference for R









Here is an interview with Paul van Eikeren, President and CEO of Blue Reference, Inc. Paul heads up a startup company addressing the need of information workers to have easier-cheaper-faster access to high-end data mining, analysis and reporting capabilities from software like R, S-PLUS, MATLAB, SAS, SPSS, Python and Ruby. His recent product, Inference for R, has been causing waves within the analytical fraternity across both R users and SAS users, especially given that it is quite well designed, has a great GUI, and is priced rather reasonably.

A few weeks ago, rumour had it that the SAS Institute was buying out the Inference for R product (note the merger and acquisition question below).

Curious to know more about this company, I happened to meet Ben Hincliffe at the www.analyticbridge.com site (which, with 5000 members, has the largest number of data analytics members, and many business intelligence members as well). Ben, who recently authored a guest post for Sandro at Data Mining Blog, then put across my request for an interview with Paul, the CEO of Blue Reference. Existing products from Blue Reference include additional analytical packages like Inference for MATLAB.

Paul is an extremely seasoned person with years in the analytical fraternity and a PhD from MIT. Here is Paul’s vision for his company and analytics product development.

Ajay: Describe your career journey. What advice would you give to today’s young people about following careers in science?

Paul: I have been blessed with an extremely productive and diversified career journey. After receiving undergraduate and graduate degrees in chemistry, I taught chemistry and carried out research as a college professor for 14 years. I spent the next 12 years heading R&D teams at three different startup companies focused on the application of novel processing technology for use in drug discovery and development. Using that wealth of acquired experience, I have had the good fortune to successfully co-found and develop, with my son Josh, two startup companies (IntelliChem and Blue Reference) directed at the use of informatics to drive more efficient and effective Research, Development, Manufacturing and Operations.

In my journey I have had the opportunity to counsel many young people regarding their career choices. I have offered two principal pieces of advice: one, for the right person, science represents an outstanding opportunity for a productive and satisfying career; and two, a science education provides an outstanding stepping stone to careers in other fields. A study disclosed in a recent Wall Street Journal article (Sarah E. Needleman, “Doing the Math to Find the Good Jobs,” 26 January 2009) revealed that mathematicians land the top spot in the new rankings of the best occupations. Science-linked occupations took 7 out of the top 20 spots.

These ratings suggest that the problem solving and innovation aspects of scientific occupations are much less stressful than other occupations, which leads to high job satisfaction. But does one have to be a genius to have a successful career in science? An interesting read on this subject is the book by Robert Weisberg (Creativity: Beyond the Myth of the Genius) in which he dispels the myth of genius being the result of a genetic gift. Weisberg argues, convincingly, that a genius exhibits three elements: (1) a basic intellectual capacity; (2) a high level of motivation/determination, which enables the genius to remain focused; and (3) immersion in their chosen field, typically represented by over 10,000 hours of study/practice/experience. It turns out that the latter element is the principal differentiator, and fortunately, it is something one has control over.

Ajay: Describe the journey that Blue Reference has made leading to its current product line, including Inference for R.

Paul: The Inference product suite represents a natural extension beyond the Electronic Laboratory Notebook (ELN) product we developed at our previous company, IntelliChem. ELNs are used by scientists and technicians to document research, experiments and procedures performed in a laboratory. The ELN is a fully electronic replacement of the paper notebook. IntelliChem (sold to Symyx in 2004) was a leader in deployment of ELNs at global pharmaceutical companies.

After seeing the successful adoption of ELNs in the laboratory, we saw an opportunity to improve upon the utility of ELN documents and the data contained therein. Essentially, we developed Inference to be a platform for enabling MS Office documents with powerful, flexible, and transparent analytic capabilities - what we call “dynamic documents” or “document mashups”. Executable code from high-level scripting languages like R, MATLAB, and .NET, is combined with data and explanatory text in the document canvas to transform it from a static record into an analytic application.

The pharmaceutical industry, in cooperation with the FDA, has begun to look at ways to implement quality by design (QbD) practices as an alternative to quality by end-testing. QbD comprises a systematic application of predictive analytics to the drug R&D process such that development timelines and costs are reduced while drug safety and efficacy is improved.

Statistical modeling and analysis plays a key role in QbD as a tool for identifying critical quality attributes and confining their variability to a specified design space. Dynamic documents fit nicely into this paradigm, and we’re currently using Inference as a platform to develop an enterprise solution for QbD. You can visit www.InferenceForQbD.com for more information about our QbD product.

Along the way, we recognized the need for Inference outside of the pharmaceutical industry. The Inference for R, Inference for MATLAB, and Inference for .NET versions are meant to serve users of these technical computing languages who have analysis, publishing, reporting, collaboration, and reproducible research needs that are best served by a document-centric environment. By using Microsoft Word, Excel and PowerPoint as the “front end,” we can serve the 500 million users that use Microsoft Office as their principal desktop productivity application.

Ajay: What is the pricing strategy for Inference for MATLAB and Inference for R, and how do you see the current recession as an opportunity for analytical products?

Paul: Our strategy is to reach out to the market of Microsoft Office users that would benefit from easy access to data mining and predictive analytics capabilities within their principal desktop productivity tool. Accordingly, we have offered the Inference product at the low price of $199 for a single user/one year subscription. Additionally, because it is implemented on top of an existing installation of Microsoft Office, the costs of training, support and maintenance are expected to be minimal.

[Screenshots: Create a simple user interface for your R application; R code directly in Excel to customize your analysis; Graphical output in an Excel tab]

Ajay: Your product seems to be a nice fit where both open source and proprietary packages from Microsoft (.NET) work together to give the customer a nice solution. Do you believe it is possible for big companies and big open source communities to work together to create software rather than just be at loggerheads?

Paul: Absolutely. We’re seeing momentum build for open source analytic solutions as the economy impacts companies, both small and large. We saw this take place in the back office with implementation of Linux and Apache Web servers, and now we’re starting to see it in the front office. Smart IT teams are looking for creative ways to stretch their resources, forcing them to look beyond established, but expensive, software products.

We’ve encountered concrete evidence of this in the financial industry. Fresh on the heels of the credit crisis, investment banks and hedge funds have begun to realize that their risk models and supporting software infrastructure are inadequate. In response, quantitative finance and risk analysts are increasingly turning to the open source R statistical computing environment for improved predictive analytics.

R has a core group of devotees in academia that drive innovation, making it a comprehensive venue for development of leading-edge data analysis methods. In order to leverage these tools, banks need a way for R to play nicely with their existing personnel and IT infrastructure. This is where Inference for R produces real value. It transforms MS Office into a platform for the development, distribution, and maintenance of R-based quantitative tools, enabling production-level predictive analytics.

Commercial distributions of R address issues of scalability and support, which might otherwise be subjects of concern. For example, REvolution Computing distributes an optimized, validated and supported distribution of R, providing peace of mind to corporate IT. REvolution also offers Enterprise R, a distribution of R for 64-bit, high performance computing.

Ajay: Please name any successful customer testimonials for Inference for R.

Paul: We have been working with the director of quantitative analytics at a large international bank. He reported that he has successfully distributed R applications to his team of research analysts and portfolio managers based on Inference in Excel. Use of this strategy eliminated the need to code complex models in Visual Basic for Applications, which is time-consuming and error-prone.

Ajay: Also, are there any issues with licensing and IP for mixing open source code and proprietary code?

Paul: The licensing issues with open source R pertain to distributing R. There are no licensing restrictions in using R. Accordingly, we do not distribute R. Rather, our customers install R separately and Inference recognizes the installation.

Ajay: So R is free and I can get Open Office for free. What are the five specific uses where Inference for R can score an edge over this and make me pay for the solution?

Paul: R is free, and many R enthusiasts would argue that all you need for R is a Linux operating system like Ubuntu, a text editor such as Emacs, and R’s command line interface. For some highly-skilled R users this is sufficient; for the new and average R user this is a nightmare.

Many people think that the largest fraction of the cost of implementing new software is the cost of the license. In actuality, and especially in the corporate world, it is the cost of training, user support, software maintenance, and switching the user base to the new software. Free open source software does not help here. Hence there is a strong ROI argument to be made for building new software applications on top of existing systems that have worked well.

Additionally, successful implementation of open source software like R requires a baseline of integration with existing systems. The fact is that Microsoft operating systems dominate the business world, as does Microsoft Office. If one is serious about using R to address the analytic needs of big business, tight integration with these systems is imperative.

Ajay: Any plans for a web hosted SaaS version for Inference for R soon?

Paul: The natural progression of Inference for R to SaaS will coincide with the next release of Office (Office 2010 or Office 14), which we expect to be largely SaaS enabled.

Ajay: Name some alliances and close partners working with Blue Reference, and what can we expect from you in terms of product launches in 2009?

Paul: We have created a product development consortium in partnership with ‘top ten’ global pharmaceutical companies. The consortium is guiding the development of an enterprise solution for Quality by Design (QbD), using Inference for R as the platform.

We are working with several consulting firms specializing in IT solutions for specialized markets like risk management and predictive analytics.

We are also working with several technology partners who have complementary products and where integration of their products with Inference provides clear and significant value to customers.

Ajay: Any truth to the rumors of an acquisition by a BIG analytics company?

Paul: Our business strategy is centered on growth through partnerships with others. Acquisition is one means to execute that strategy.

Ajay: How do you see this particular product (for R) shaping up down the years?

Paul: R’s success can be attributed, in large part, to the support of its loyal open source community. Its enthusiastic use in academia bodes very well for its growth as a cutting-edge analytics tool. It is just a matter of time before commercial analytic solutions powered by R become de rigueur. We’re happy to be at the tip of the spear.

Ajay: Any Asia plans for Blue Reference, or are you still happy with the Oregon location? How do you plan to interact with graduate schools and academia for your products?

Paul: Although we don’t have a major private university in our backyard, Oregon State University has opened a campus here. And we’ve been in dialogue with the global academic community from day one. Over 100 academic institutions around the world use Inference through our academic licensing program. Inference is a great tool for preparing dynamic lessons and publishing reproducible research.

Our Central Oregon location is home to a growing high-tech sector that we’ve been a part of for decades. We’ve had success building large and profitable companies here. Bend attracts Silicon Valley types who come here for vacation and don’t want to leave - they just can’t seem to resist the quality of life and bountiful recreational opportunities that this area offers. It’s a good mix of work and play.

Biography

Paul van Eikeren is President and CEO of Blue Reference, Inc. He is responsible for guiding the strategic direction of the company through novel products and services development, partnerships and alliances in the realm of application of informatics to faster-cheaper-better research, development, manufacturing and operations. Van Eikeren is a successful serial entrepreneur, which includes the co-founding of IntelliChem with his son Josh and its ultimate sale to Symyx Technologies. He has headed up R&D at several startup companies focused on drug discovery and development including Sepracor Inc., Argonaut Technologies, Inc, and Bend Research, Inc. He served as Professor of Chemistry and Biochemistry at Harvey Mudd College of Science and Engineering. He is author/co-author and inventor/co-inventor of over 50 scientific articles and patents directed at the application of chemical, biochemical and computational technologies. Van Eikeren holds a BA degree in Chemistry from Columbia University and a PhD in Chemistry from MIT.

Ajay- To know more, I recommend checking out the free evaluation at http://inferenceforr.com/ especially if you need to rev up your MS Office installation with greater graphics and analytics juice.


DecisionStats - Interview David Smith REvolution Computing







Here is an interview with REvolution Computing’s Director of Community David Smith.

“Our development team spent more than six months making R work on 64-bit Windows (and optimizing it for speed), which we released as REvolution R Enterprise bundled with ParallelR.” - David Smith

Ajay- Tell us about your journey in science. In particular tell us what attracted you to R and the open source movement.

David- I got my start in science in 1990 working with CSIRO (the government science organization in Australia) after I completed my degree in mathematics and computer science. Seeing the diversity of projects the statisticians there worked on really opened my eyes to statistics as the way of objectively answering questions about science.

That’s also when I was first introduced to the S language, the forerunner of R. I was hooked immediately; it was just so natural for doing the work I had to do. I also had the benefit of a wonderful mentor, Professor Bill Venables, who at the time was teaching S to CSIRO scientists at remote stations around Australia. He brought me along on his travels as an assistant. I learned a lot about the practice of statistical computing helping those scientists solve their problems (and got to visit some great parts of Australia, too).

Ajay- How do you think we should help bring more students to the fields of mathematics and science?

David- For me, statistics is the practical application of mathematics to the real world of messy data, complex problems and difficult conclusions. And in recent years, lots of statistical problems have broken out of geeky science applications to become truly mainstream, even sexy. In our new information society, graduating statisticians have a bright future ahead of them which I think will inevitably draw more students to the field.

Ajay- Your blog at REvolution Computing is one of the best technical corporate blogs, in particular the monthly round-up of new packages, R events and product launches, all written in a lucid style. Are there any plans for a REvolution Computing community or network as well, instead of just the blog?

David- Yes, definitely. We recently hired Danese Cooper as our Open Source Diva to help us in this area. Danese has a wealth of experience building open-source communities, such as for Java at Sun. We’ll be announcing some new community initiatives this summer. In the meantime, of course, we’ll continue with the Revolutions blog, which has proven to be a great vehicle for getting the word out about R to a community that hasn’t heard about it before. Thanks for the kind words about the blog, by the way — it’s been a lot of fun to write. It will be a continuing part of our community strategy, and I even plan to expand the roster of authors in the future, too. (If you’re an aspiring R blogger, please get in touch!)

Ajay- I get confused about what exactly 32-bit and 64-bit computing mean in terms of hardware and software. What is the deal there? How do Enterprise solutions from REvolution take care of 64-bit computing? How exactly do parallel computing and optimized math libraries in REvolution R help as compared to other flavors of R?

David- Fundamentally, 64-bit systems allow you to process larger data sets with R — as long as you have a version of R compiled to take advantage of the increased memory available. (I wrote about some of the technical details behind this recently on the blog.)  One of the really exciting trends I’ve noticed over the past 6 months is that R is being applied to larger and more complex problems in areas like predictive analytics and social networking data, so being able to process the largest data sets is key.
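To make the memory point concrete, here is a rough back-of-the-envelope sketch in Python (not from the interview). It assumes each numeric value is stored as an 8-byte double, as R does for numeric vectors, and treats 2^32 bytes as a theoretical upper bound for a 32-bit process; real limits are lower once the operating system and R itself take their share. The function names and the example table size are hypothetical.

```python
# Rough address-space arithmetic. This is an illustration only: it ignores
# operating system limits and R's own memory overhead (assumption).
BYTES_PER_DOUBLE = 8  # a numeric value stored as a double-precision float


def addressable_bytes(pointer_bits: int) -> int:
    """Theoretical upper bound on bytes addressable with pointers of this width."""
    return 2 ** pointer_bits


def max_doubles(pointer_bits: int) -> int:
    """Upper bound on how many 8-byte doubles fit in that address space."""
    return addressable_bytes(pointer_bits) // BYTES_PER_DOUBLE


def table_bytes(rows: int, numeric_columns: int) -> int:
    """Raw size of an all-numeric table, before any overhead."""
    return rows * numeric_columns * BYTES_PER_DOUBLE


limit_32 = addressable_bytes(32)
# Hypothetical data set: 100 million rows by 10 numeric columns.
example = table_bytes(rows=100_000_000, numeric_columns=10)

print(f"32-bit address space: {limit_32 / 2**30:.1f} GiB")
print(f"example table:        {example / 2**30:.2f} GiB")
print("fits in 32-bit address space:", example <= limit_32)
```

Under these assumptions the example table needs about twice the entire 32-bit address space, which is why a 64-bit build of R matters for data sets of that size.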

One common misperception is that 64-bit systems are inherently faster than their 32-bit equivalents, but this isn’t generally the case. To speed up large problems, the best approach is to break the problem down into smaller components and run them in parallel on multiple machines. We created the ParallelR suite of packages to make it easy to break down such problems in R and run them on a multiprocessor workstation, a local cluster or grid, or even cloud computing systems like Amazon’s EC2.

While the core R team produces versions of R for 64-bit Linux systems, they don’t make one for Windows. Our development team spent more than six months making R work on 64-bit Windows (and optimizing it for speed), which we released as REvolution R Enterprise bundled with ParallelR. We’re excited by the scale of the applications our subscribers are already tackling with a combination of 64-bit and parallel computing.
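The "break the problem down and run the pieces in parallel" idea can be sketched in a few lines. The sketch below is in Python using only the standard library; it is not ParallelR and does not reflect its API. The helper names (`split`, `sum_of_squares`, `parallel_sum_of_squares`) are hypothetical, and a thread pool stands in for the workstation, cluster, or cloud workers mentioned in the interview. For CPU-bound Python code one would more likely use a process pool or separate machines.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, n_chunks):
    """Split a list into n_chunks contiguous pieces of near-equal size."""
    size, extra = divmod(len(data), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional element each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def sum_of_squares(chunk):
    """The per-chunk work: any function that needs only its own chunk."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, n_workers=4):
    """Split the data, process the chunks concurrently, combine the partial results."""
    chunks = split(data, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    return sum(partials)


data = list(range(1_000))
total = parallel_sum_of_squares(data)
print(total == sum(x * x for x in data))
```

The pattern works because each chunk can be processed without looking at any other chunk, so the only coordination needed is the final combining step.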

Ajay- Command line is oh so commanding. Please describe any plans to support or help any R GUI like Rattle or R Commander. Do you think REvolution R can get more users if it does help a GUI?

David- Right now we’re focusing on making R easier to use for programmers by creating a new GUI for programming and debugging R code. We heard feedback from some clients who were concerned about training their programmers in R without a modern development environment available. So we’re addressing that by improving R to make the “standard” features programmers expect (like step debugging and variable inspection) work in R and integrating it with the standard environment for programmers on Windows, Visual Studio.

In my opinion, R’s strength lies in its combination of high-quality statistical algorithms with a language ideal for applying them, so “hiding” the language behind a general-purpose GUI negates that strength a bit. On the other hand it would be nice to have an open-source “user-friendly” tool for desktop statistical analysis, so I’m glad others are working to extend R in that area.

Ajay- Companies like SAS are investing in SaaS and cloud computing. Zementis offers scored models on the cloud through PMML. Any views on just building the model or analytics on the cloud itself?

David- To me, cloud computing is a cost-effective way of dynamically scaling hardware to the problem at hand. Not everyone has access to a 20-machine cluster for high-performance computing, and even those that do can’t instantly convert it to a cluster of 100 or 1000 machines to satisfy a sudden spike in demand. REvolution R Enterprise with ParallelR is unique in that it provides a platform for creating sophisticated data analysis applications distributed in the cloud, quickly and easily.

Using clouds for building models is a no-brainer for parallel-computing problems: I recently wrote about how parallel backtesting for financial trading can easily be deployed on Amazon EC2, for example. PMML is a great way of deploying static models, but one of the big advantages of cloud computing is that it makes it possible to update your model much more frequently, to keep your predictions in tune with the latest source data.

Ajay- What are the major alliances that REvolution has in the industry?

David- We have a number of industry partners. Microsoft and Intel, in particular, provide financial and technical support allowing us to really strengthen and optimize R on Windows, a platform that has been somewhat underserved by the open-source community. With Sybase, we’ve been working on combining REvolution R and Sybase RAP to produce some exciting advances in financial risk analytics. Similarly, we’ve been doing work with Vhayu’s Velocity database to provide high-performance data extraction. On the life sciences front, Pfizer is not only a valued client but in many ways a partner who has helped us “road-test” commercial grade R deployment with great success.

Ajay- What are the major R packages that REvolution supports and optimizes and how exactly do they work/help?

David- REvolution R works with all the R packages: in fact, we provide a mirror of CRAN so our subscribers have access to the truly amazing breadth and depth of analytic and graphical methods available in third-party R packages. Those packages that perform intensive mathematical calculations automatically benefit from the optimized math libraries that we incorporate in REvolution R Enterprise. In the future, we plan to work with authors of some key packages to provide further improvements, in particular to make packages work with ParallelR to reduce computation times in multiprocessor or cloud computing environments.
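As a rough illustration of the kind of "intensive mathematical calculation" that benefits from an optimized math library, the sketch below times a dense matrix product in plain R. The matrix size and the timing approach are assumptions chosen for illustration; the code says nothing about how REvolution R Enterprise itself is built, and the time it prints depends entirely on the machine and the math library R is linked against.

```r
# Dense matrix multiplication is delegated to the underlying BLAS library,
# so it is a typical operation that speeds up when that library is optimized.
set.seed(123)
n <- 500L
m <- matrix(rnorm(n * n), n, n)

# system.time() evaluates the expression in the calling frame, so p is kept.
elapsed <- system.time(p <- m %*% m)["elapsed"]
cat(sprintf("%d x %d matrix product took %.3f seconds\n", n, n, elapsed))
```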

Ajay- Are you planning to lay off people during the recession? Does REvolution Computing offer internships to college graduates? What do people at REvolution Computing do to have fun?

David- On the contrary, we’ve been hiring recently. We don’t have an intern program in place just yet, though. For me, it’s been a really fun place to work. Working for an open-source company has a different vibe than the commercial software companies I’ve worked for before. The most fun for me has been meeting with R users around the country and sharing stories about how R is really making a difference in so many different venues — over a few beers of course!


David Smith
Director of Community

David has a long history with the statistical community. After graduating with a degree in Statistics from the University of Adelaide, South Australia, David spent four years researching statistical methodology at Lancaster University (United Kingdom), where he also developed a number of packages for the S-PLUS statistical modeling environment. David continued his association with S-PLUS at Insightful (now TIBCO Spotfire) where for more than eight years he oversaw the product management of S-PLUS and other statistical and data mining products. David is the co-author (with Bill Venables) of the tutorial manual, An Introduction to R, and one of the originating developers of ESS: Emacs Speaks Statistics. Prior to joining REvolution, David was Vice President, Product Management at Zynchros, Inc.

Ajay - To know more about David Smith and REvolution Computing, do visit http://www.revolution-computing.com and

http://www.blog.revolution-computing.com
Also see the interview with Richard Schultz, CEO of REvolution Computing, here.

http://www.decisionstats.com/2009/01/31/interviewrichard-schultz-ceo-revolution-computing/


DecisionStats - More R please


                    

Some R news

0 The R Foundation Website. I guess the www.r-project.org team is busy prettifying the website before the annual R users conference kicks in (I was told it has the aesthetic visual appeal of a dead cat splattered on the autobahn, a very HTML 4.0 kind of retro look).

I can’t believe the R site and R core honchos find the following image the prettiest one to represent the graphical abilities of R.

The R core site has tremendous functionality and demand, though I wonder if they could just put up some ads and get some funding or a two-way research tie-up with Google. Google uses R extensively, can help with online methods as well, and is listed as a supporting organization at http://www.r-project.org/foundation/memberlist.html

The R archives are a collection of emails, and that’s not documentation at all - but

1 The REvolution R website, and particularly David Smith’s blog, is a great way to stay updated on R news, at http://blog.revolution-computing.com/

I have covered REvolution R before, and they are truly impressive.

http://www.decisionstats.com/2009/01/31/interviewrichard-schultz-ceo-revolution-computing/

It seems the domain name revolutioncomputing.com was squatted (by NC?), so that’s why the hyphenated web name. It is a very lucid website, though I do request them to put up more videos/podcasts, and a “Tweet this” button would be great :))



and another more techie post here
  
http://blog.revolution-computing.com/2009/05/verifying-zipfs-powerdistribution-law-for-cities.html

Another great source is Twitter. It seems that R users on Twitter use the hashtag #rstats to tag and search for R news and code, which should help R bloggers and, at a later date, users.

Check it out here:

http://search.twitter.com/search?q=%23rstats
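A small note on hashtag search links: a literal # in a URL starts the page fragment, so the hashtag has to be percent-encoded as %23 for the search term to reach the server. A minimal sketch in R, assuming the #rstats hashtag and the search address mentioned above, using `URLencode` from the standard `utils` package:

```r
# '#' is a reserved URL character; reserved = TRUE makes URLencode escape it.
hashtag <- "#rstats"
query <- URLencode(hashtag, reserved = TRUE)

# Assemble the search link from the base address and the encoded query.
search_url <- paste0("http://search.twitter.com/search?q=", query)
print(search_url)
```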

2 Some more R forums and sites

Forum for R Enterprise Users http://www.revolution-computing.com/forum

An R tips site http://onertipaday.blogspot.com/

The R Journal ( yes there is a journal for all hard working R fans) http://journal.r-project.org/  

R on Linkedin http://www.linkedin.com/groups?about=&gid=77616

and the Analytic Bridge community group for R
  
http://www.analyticbridge.com/group/rprojectandotherfreesoftwaretools

3 Here is a terrific post by Robert Grossman

at http://blog.rgrossman.com/2009/05/17/running-r-on-amazons-ec2/

I liked the way he built the case for using R on Amazon EC2 (a business case, not a use case) and then proceeded to a step-by-step tutorial: a simple and powerful blog post. I hope R comes out with a standardized online R doc like that, a single-point searchable archive for code, something like the SAS online doc (which remains free for WPS users). But the way the web is evolving, it seems the present mishmash method will continue.


Here are the main steps to use R on a pre-configured AMI.

Set up.
The set up needs to be done just once.

  1. Set up an Amazon Web Services (AWS) account by going to:

    aws.amazon.com.

    If you already have an Amazon account for buying books and other items from Amazon, then you can use this account also for AWS.

  2. Log in to the AWS console.
  3. Create a “key-pair” by clicking on the link “Key Pairs” in the Configuration section of the Navigation Menu on the left hand side of the AWS console page.
  4. Click on the “Create Key Pair” button, about a quarter of the way down the page.
  5. Name the key pair and save it to your working directory, say /home/rlg/work.

Launching the AMI. These steps are done whenever you want to launch a new AMI.

  1. Login to the AWS console. Click on the Amazon EC2 tab.
  2. Click the “AMIs” button under the “Images and Instances” section of the left navigation menu of the AWS console.
  3. Enter “opendatagroup” in the search box and select the AMI labeled
    “opendatagroup/r-timeseries.manifest.xml”, which
    is AMI instance “ami-ea846283”.
  4. Enter the number of instances to launch (1), the name of the key pair that you have previously created, and select “web server” for the security group. Click the launch button to launch the AMI. Be sure to terminate the AMI when you are done.
  5. Wait until the status of the AMI is “running.” This usually takes about 5 minutes.

Accessing the AMI.

  1. Get the public IP address of the new AMI. The easiest way to do this is to select the AMI by checking the box. This provides some additional information about the AMI at the bottom of the window. You can copy the IP address there.
  2. Open a console window and cd to your working directory which contains the key-pair that you previously downloaded.
  3. Type the command:
  ssh -i testkp.pem -X root@ec2-67-202-44-197.compute-1.amazonaws.com

Here we assume that the name of the key-pair you created is “testkp.pem.” The flag “-X” starts a session that supports X11. If you don’t have X11 on your machine, you can still login and use R but the graphics in the example below won’t be displayed on your computer.  
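If you prefer to drive the connection from a local R session, the ssh call above can be assembled as an argument vector. This is only a sketch: `ssh_args` is a hypothetical helper, not part of the original tutorial, and the key name and host are the example values from step 3. Note that ssh refuses private keys that other users can read, so the key file usually needs restrictive permissions first.

```r
# Hypothetical helper: build the arguments for the ssh command in step 3.
ssh_args <- function(key_file, host, user = "root", x11 = TRUE) {
  # if (x11) yields NULL when FALSE, and c() silently drops NULL
  c("-i", key_file, if (x11) "-X", paste0(user, "@", host))
}

args <- ssh_args("testkp.pem", "ec2-67-202-44-197.compute-1.amazonaws.com")
cat("ssh", args, "\n")

# Left commented out because they need a real key file and a running instance:
# Sys.chmod("testkp.pem", mode = "0400")  # make the key readable by the owner only
# system2("ssh", args)                    # open the interactive session
```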

Using R on the AMI.

  1. Change your directory and start R

    #cd examples
    #R
  2. Test R by entering an R expression, such as:

    > mean(1:100)
    [1] 50.5
    >
  3. From within R, you can also source one of the example scripts to see some time series computations:

    > source('NYSE.r')
  4. After a minute or so, you should see a graph on your screen. After the graph is finished being drawn, you should see a prompt:

    CR to continue

    Enter a carriage return and you should see another graph. You will need to enter a carriage return 8 times to complete the script (you can also choose to break out of the script if you get bored with all the graphs).
  5. When you are done, exit your R session with a control-D. Exit your ssh session with an “exit” and terminate your AMI from the Amazon AWS console. You can also choose to leave your AMI running (it is only a few dollars a day).

Acknowledgements: Steve Vejcik from Open Data Group wrote the R scripts and configured the AMI.  

Ajay- Terrific R companies, blogs, tweets, research and sites, but do let me know your feedback. Just un-other R day.


Using R on Amazon Ec2

A great post from http://blog.rgrossman.com/2009/05/17/running-r-on-amazons-ec2/

Here are the main steps to use R on a pre-configured AMI.

Set up.
The set up needs to be done just once.

  1. Set up an Amazon Web Services (AWS) account by going to:

    aws.amazon.com.

    If you already have an Amazon account for buying books and other items from Amazon, then you can use this account also for AWS.

  2. Login to the AWS console
  3. Create a “key-pair” by clicking on the link “Key Pairs” in the Configuration section of the Navigation Menu on the left hand side of the AWS console page.
  4. Click on the “Create Key Pair” button, about a quarter of the way down the page.
  5. Name the key pair and save it to your working directory, say /home/rlg/work.

Launching the AMI. These steps are done whenever you want to launch a new AMI.

  1. Login to the AWS console. Click on the Amazon EC2 tab.
  2. Click the “AMIs” button under the “Images and Instances” section of the left navigation menu of the AWS console.
  3. Enter “opendatagroup” in the search box and select the AMI labeled
    “opendatagroup/r-timeseries.manifest.xml”, which
    is AMI instance “ami-ea846283”.
  4. Enter the number of instances to launch (1), the name of the key pair that you have previously created, and select “web server” for the security group. Click the launch button to launch the AMI. Be sure to terminate the AMI when you are done.
  5. Wait until the status of the AMI is “running.” This usually takes about 5 minutes.

Accessing the AMI.

  1. Get the public IP address of the new AMI. The easiest way to do this is to select the AMI by checking the box. This provides some additional information about the AMI at the bottom of the window. You can copy the IP address there.
  2. Open a console window and cd to your working directory which contains the key-pair that you previously downloaded.
  3. Type the command:
    ssh -i testkp.pem -X root@ec2-67-202-44-197.compute-1.amazonaws.com

    Here we assume that the name of the key-pair you created is “testkp.pem.” The flag “-X” starts a session that supports X11. If you don’t have X11 on your machine, you can still login and use R but the graphics in the example below won’t be displayed on your computer.

Using R on the AMI.

  1. Change your directory and start R

    #cd examples
    #R
  2. Test R by entering an R expression, such as:

    > mean(1:100)
    [1] 50.5
    >
  3. From within R, you can also source one of the example scripts to see some time series computations:


    > source('NYSE.r')

  4. After a minute or so, you should see a graph on your screen. After the graph is finished being drawn, you should see a prompt:

    CR to continue

    Enter a carriage return and you should see another graph. You will need to enter a carriage return 8 times to complete the script (you can also choose to break out of the script if you get bored with all the graphs).
  5. When you are done, exit your R session with a control-D. Exit your ssh session with an “exit” and terminate your AMI from the Amazon AWS console. You can also choose to leave your AMI running (it is only a few dollars a day).
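The post does not show what is inside NYSE.r, so the following is only a stand-in sketch of how a script can draw a sequence of time series graphs and pause with a "CR to continue" prompt between them. The simulated series and the three plots are assumptions made for illustration.

```r
# Simulated daily series standing in for the data used by the example script.
set.seed(1)
x <- ts(cumsum(rnorm(250)), frequency = 5)

# Each entry draws one graph; the real script reportedly shows several more.
plots <- list(
  function() plot(x, main = "Simulated series"),
  function() acf(x, main = "Autocorrelation"),
  function() plot(diff(x), main = "First differences")
)

for (draw in plots) {
  draw()
  # Wait for Enter between graphs, but only when a person is at the console.
  if (interactive()) readline("CR to continue")
}
```

Run over `ssh -X`, the graphs appear on your local display; without X11, R falls back to writing them to a file.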

Acknowledgements: Steve Vejcik from Open Data Group wrote the R scripts and configured the AMI.


Parallel Botnet using R and Wordpress on Servers
