Genres in the streaming music industry


Just the other day, Google announced Google Play Music at its Google I/O conference, and rumors of Apple's iRadio and the challenges involved have been in the news for quite some time now. Some say Google beat Apple in the world of Internet music. But if you look closely, there is a fundamental difference in the approaches that Google and Apple have taken to digital streaming music. To understand this, let's look at two of the front-runners in this field today: Pandora and Spotify.

Although Pandora and Spotify share the goal of delivering music to consumers without requiring them to purchase albums, their ways of doing it differ. While Spotify follows an on-demand, subscription-based model, Pandora has opted for a webcasting service. This difference is not as apparent on desktops and laptops, where both offer free access with ads injected rather frequently. But it is hardly a shocking revelation that the number of people using their desktops for music has diminished dramatically in the last decade or so. In the handheld segment, Spotify offers a 30-day free trial followed by a $9.99 per month subscription, while Pandora has extended the same model it uses on the desktop. The result is the emergence of two schools of thought: the Pandora model and the Spotify model.

Google has decided to follow the Spotify model, with a $7.99/month fee for those who subscribe before June 30th and $9.99/month for those who subscribe after. Of course, the 30-day free trial is part of the package. Rumor has it that Apple will follow the Pandora model, judging by its difficulties in obtaining licensing rights, just as Pandora has been criticized for its small collection, again owing to copyright issues.

So now we have two genres in the Internet music industry. Pandora and Spotify have each been successful in their own right, and when two big players such as Google and Apple take separate stances on digital music, we have a battle of strategies. Only time will tell which of them will pull ahead. Until then, let's enjoy the competition!


Ngrams and Google


When talking about Big Data, one initiative is worth a mention: Google Ngrams. Back in 2004, Google, in its own magnanimous way, started a program to digitize every single printed document, within copyright limits. Started as a partnership with some of the best-known libraries around the globe, such as the New York Public Library, the Harvard University Library and the Bodleian Library at the University of Oxford, the plan was to make high-resolution digital images of all printed documents (books, magazines, et al.) and save them in a huge repository that is searchable through books.google.com.

As the collection grew, Google realized the potential to digitize it one word at a time. Through a tool known as reCAPTCHA, they started to extract every word from every image that was scanned. What was born out of it was an amazingly large data set of words dating back to 1500. By 2012, they had almost 15% of all printed books digitized, and that amounted to almost 700 billion words! What came out of this was Google Ngrams!

An “ngram” is a sequence of letters of any length, which could be a word, a misspelling, a phrase or gibberish

Google Ngrams is a searchable word repository that counts the occurrences of a word or a phrase in a “corpus of books” (as Google itself puts it). It then plots those occurrences across time, and the result is a visualization of how frequently the words were used over the years.
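To make the idea concrete, here is a minimal sketch of that kind of counting. It is not Google's implementation: it assumes an n-gram is a run of n whitespace-separated words, and the `toy_corpus`, `ngrams` and `frequency_by_year` names are made up for illustration.

```python
from collections import Counter


def ngrams(text, n):
    """Return every run of n consecutive whitespace-separated tokens in text."""
    tokens = text.split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def frequency_by_year(corpus, phrase):
    """Map each year to the share of that year's n-grams that equal phrase.

    The comparison is case sensitive, so "Internet" and "internet" differ.
    """
    n = len(phrase.split())
    result = {}
    for year, texts in sorted(corpus.items()):
        counts = Counter()
        for text in texts:
            counts.update(ngrams(text, n))
        total = sum(counts.values())
        result[year] = counts[phrase] / total if total else 0.0
    return result


# Hypothetical two-year "corpus of books"; a real one holds billions of words.
toy_corpus = {
    1990: ["the network is slow", "the Internet is new"],
    2000: ["the Internet is everywhere", "the Internet is fast"],
}

trend = frequency_by_year(toy_corpus, "Internet")
for year, share in trend.items():
    print(year, share)
```

Plotting `trend` against the year gives a tiny version of the chart the Ngram viewer draws. Dividing by the yearly total matters, because far more was printed in recent years than in the 1500s.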

Curious as I was, I decided to try out a few of today's buzzwords to see how far back each one was used. The results were startling!

Internet

technology

The word “technology” (keep in mind the search is case sensitive) was used as far back as the early 1500s, which is fair enough considering it is a well-established term in the English dictionary. But what was even more puzzling is that the word “Internet” was used in the 1590s! Now what could that have referred to? Also, although ARPANET and packet switching started to evolve in the 1960s, it wasn't until the 1990s that the word “Internet” started to be widely used in print!

Not only SQL but also ….


Keeping with the theme of Big Data, which we discussed a couple of days back, the concept of N=all gives rise to a whole slew of new challenges, an obvious consequence of dealing with such large chunks of data: storage and retrieval! The ability to quickly retrieve, analyze and correlate data to derive information becomes essential when dealing with big data. And for such massive amounts of data, relational databases do not seem to jibe all that well. One of the major reasons is that relational (or, as I may now safely call them, traditional) databases require the data they store to have a structure. When you are trying to correlate users' location data with local deals (as an example) and then add the users' personal credit card usage, the data does not always fall into a structured pattern that fits a relational database. Along came NoSQL. The name was borrowed from the 1998 open source RDBMS developed by Carlo Strozzi, and was later popularized by Eric Evans of Rackspace.

Unlike traditional SQL databases, NoSQL is better viewed as a collective term for a variety of new data storage backends, typically with strict transactions relaxed or taken out altogether. With its loose definition, a NoSQL store can aggregate into a single record the data that would span rows across multiple tables in a traditional relational database. This obviously results in enormous chunks of data, which poses storage challenges. However, with the cost of storage falling rapidly, this can be ignored when weighed against the potential you now have. Couchbase, one of the companies that caught on quickly to this revolution in data storage and retrieval with its document-oriented database technology, has published an interesting article on why NoSQL.
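As a rough illustration of that aggregation, the sketch below folds rows from two relational tables into one nested, JSON-style document of the kind a document-oriented store would hold. It uses only Python's built-in sqlite3 and json modules, not any NoSQL product, and the table names, field names and sample rows are all hypothetical.

```python
import json
import sqlite3

# A tiny in-memory relational database with two related tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    CREATE TABLE purchases (user_id INTEGER, merchant TEXT, amount REAL);
    INSERT INTO users VALUES (1, 'Alice', 'Springfield');
    INSERT INTO purchases VALUES (1, 'Cafe', 4.5);
    INSERT INTO purchases VALUES (1, 'Bookstore', 12.0);
""")


def user_document(conn, user_id):
    """Fold one user's rows from both tables into a single nested document."""
    name, city = conn.execute(
        "SELECT name, city FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    purchases = [
        {"merchant": merchant, "amount": amount}
        for merchant, amount in conn.execute(
            "SELECT merchant, amount FROM purchases "
            "WHERE user_id = ? ORDER BY rowid",
            (user_id,),
        )
    ]
    return {"_id": user_id, "name": name, "city": city, "purchases": purchases}


doc = user_document(conn, 1)
print(json.dumps(doc, indent=2))
```

The document repeats information that the relational schema keeps normalized, which is the storage cost mentioned above. In exchange, everything about one user comes back in a single read with no join.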

They are not the only ones that have embraced this new idea. Hadoop is yet another that has quickly become a household name. Developed and sustained by a group of unpaid volunteers, Hadoop is a framework for processing large data sets, better known as big data. Said to have started out as a free implementation of Google's MapReduce, it has had several big names build services and solutions around it, notable ones being Amazon Web Services (AWS), VMware Hadoop Virtual Extensions (HVE) and IBM BigInsights.

Yet another database that has been gaining popularity of late is MongoDB, a project spun off by 10gen. Like Couchbase, it is a document-oriented database, and it has been picked up by several organizations, including SAP, MTV and SourceForge.

With an “unstructured” database come the challenges of querying it. MongoDB expresses queries as JSON-like documents (stored in a binary form known as BSON, or Binary JSON), whereas Couchbase has backed a SQL-like query language known as UnQL (Unstructured Query Language), which is being put forward as a standard.

While all of these are still in the nascent stages of development, and with the big data wave rapidly approaching its peak, let me leave you with a slide deck from QCon London 2013, presented by Matt Asay, VP of Corporate Strategy at 10gen, on the “Past, Present and Future of NoSQL”.

Digitizing the cash counters

square-20100511-600

I'm sure the image that you see above has become quite a familiar sight across America. Apple stores have been flaunting a similar device for quite a while now, one that almost resembles the Mophie. I first noticed this one at Conshohocken Cafe, a quaint little breakfast place in Conshohocken, PA. Square, as it is called, makes its money through the 2.75% transaction fee charged per swipe. This post, however, is not about the Square reader itself but about the Square Stand, which was announced today. At $299 apiece, plus a $499 iPad, it can replace the traditional cash register in the blink of an eye. It sounds quite simple, as we come to depend more and more on the mobile device.
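For a feel of the numbers quoted above, here is a small back-of-the-envelope sketch. The 2.75% rate and the $299 and $499 prices come from this post; rounding the fee to the nearest cent is my own assumption (I do not know how Square actually rounds), and the function names are made up.

```python
from decimal import Decimal, ROUND_HALF_UP

SWIPE_FEE_RATE = Decimal("0.0275")  # 2.75% per swipe, as quoted in the post
STAND_PRICE = Decimal("299")        # Square Stand
IPAD_PRICE = Decimal("499")         # iPad to go with it


def swipe_fee(amount):
    """Fee on one swiped sale, rounded to the nearest cent (assumed rounding)."""
    fee = Decimal(amount) * SWIPE_FEE_RATE
    return fee.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def net_payout(amount):
    """What the merchant keeps from one swiped sale."""
    return Decimal(amount) - swipe_fee(amount)


hardware_cost = STAND_PRICE + IPAD_PRICE

print("Fee on a $100.00 sale:", swipe_fee("100.00"))
print("Merchant keeps:", net_payout("100.00"))
print("Up-front hardware cost:", hardware_cost)
```

Using Decimal rather than float keeps the cents exact, which is the usual choice for money.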

But wait, there is more. On the very same day, PayPal announced its revolutionary product, known as Cash for Registers. With free credit, debit and PayPal processing for the rest of the year for any qualifying US business, we now have a competition!

The era of cash registers that open up to a “slot machine” of quarters and pennies is slowly coming to an end. Whether it's PayPal or Square, the digital revolution has spared none. Soon the traditional cash register will be just another antique in a museum!


The new library of Alexandria – the power of Big Data


A term that has been attracting a substantial amount of curiosity in the recent past, and one that will perhaps keep growing in importance as the Internet and the flow of information become more widely available, is Big Data. Although the word has been ringing all around me and my place of work for quite some time, what really triggered my interest are two books that I am currently alternating between: “The Long Tail” by Chris Anderson, a book that describes how endless choice is creating unlimited demand, and “Big Data” by Viktor Mayer-Schönberger and Kenneth Cukier, a book that sets out to describe a concept that would revolutionize the way we live and think.

Wikipedia defines big data as

a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications

But perhaps a more fitting definition is the one described in the book “Big Data”: a large set of data derived from a sample size N where N=all. I find the latter more fitting because a data set does not have to be big as long as it encompasses the entire population under study. For example, the book describes a study of corruption in sumo wrestling in Japan. The study collected data from almost 65,000 matches across 7 years to look for a correlation. The data in this case was not as big as one would imagine. But the fact that it “surveyed” the entire set of matches across those 7 years, rather than limiting itself to a sample of them, made me lean towards calling it “big data”.

Big data changes the fundamental aspect of life by giving it a quantitative dimension

say Viktor and Kenneth in their book. Humans have long tried to quantify aspects of human behavior in order to gain insights and make predictions. One of the terms I used in the previous paragraph is of particular relevance here: “survey”. Surveys were perhaps one primitive form of gathering relevant data. One of the major challenges of a survey is that your sample size is N < all, which means you only have data for the population that actually took your survey. The results are then biased towards the characteristics of that limited population, which does not necessarily represent the whole. As this problem came to be understood, statisticians found that the results were more accurate if the sample was chosen at random, rather than simply made larger. Studies have shown that extrapolating from a survey of a random sample yields more accurate results than a large sample drawn from a specific subset of the population.

This still does not solve one challenge, which I'd like to call active polling vs. passive polling. In almost all cases, a survey studies a specific set of questions answered by a specific group of people; simply put, a survey is active polling. For truly understanding human behavior this can prove inaccurate, especially because humans tend to stop and think when answering a question. It is analogous to studying how people behave in a group while a tutor or a professor sits in the group: the mere awareness of a study being conducted could skew the behavior. If the same group can be "passively" observed instead, the information gathered is likely to be closer to accurate. The same can be said of any method of predictive analysis. Big Data analysis methodologies, in my view, are far more passive in the way they poll data, and hence tend to be more accurate.
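The earlier point about random samples beating large but skewed ones is easy to try out with a small simulation. This is only a sketch under invented assumptions: a made-up population in which 20% of people ("heavy users") score around 70 and the rest around 50, with a fixed random seed so that the run is repeatable.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: 20,000 heavy users and 80,000 light users.
heavy = [rng.gauss(70, 10) for _ in range(20_000)]
light = [rng.gauss(50, 10) for _ in range(80_000)]
population = heavy + light
true_mean = mean(population)  # the N=all answer

# A small sample drawn at random from everyone...
small_random_sample = rng.sample(population, 500)
# ...versus a sample 20 times larger, drawn only from the heavy users.
large_biased_sample = rng.sample(heavy, 10_000)

random_error = abs(mean(small_random_sample) - true_mean)
biased_error = abs(mean(large_biased_sample) - true_mean)

print("true mean:", round(true_mean, 2))
print("error of 500 random respondents:", round(random_error, 2))
print("error of 10,000 heavy-user respondents:", round(biased_error, 2))
```

The biased sample stays wrong no matter how large it grows, because it never sees the light users. Only the N=all figure, `true_mean`, needs no extrapolation at all.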

In the coming weeks, as I wander through the world of Big Data, I plan to post more examples of and insights into this amazing field that has been gaining significant relevance in today's world. I plan to talk about one aspect in each of my posts so as to limit yet another challenge of big data, known as information overload! But that does not entirely solve the problem. My plan is also to encourage more interaction with my readers and learn more from them as I meander through. Feel free to enthral me with your comments.

Project Mighty and Napoleon – Adobe's version of the stylus and ruler

Does anyone remember the ancient geometry box – the one that was a common sight in middle and high schools, back in the 80s and 90s? If not, here is something that could spark those memory cells.

tin-geometry

Now, fast forward to 2013 and to Adobe's MAX Conference, held this week in Los Angeles. Although the focus was mainly on the cloud offering of their wonderfully successful Creative Suite, something interesting sneaked in at the tail end: a pair of projects they call Mighty and Napoleon. The names don't reveal much, and neither does the website. But the concept involves a stylus and a ruler (!!!). The idea is to let the creative elite use their iPads effectively and connect directly to the cloud and to apps such as Typekit and Kuler. Again, not much has been revealed, but these images do show some potential.

Project Mighty

Exactly how useful the stylus will be outside the Adobe suite of apps is not yet known. Having long been a proponent of a “stylus-like” tool for the iPad, especially for taking notes during meetings, I find this a welcome delight. And of course, this could possibly change the way geometry is taught in schools!