What is Big Data?
Big data is an important topic of discussion in IT, research, medicine, economics, and business. The term "big" means large or immense, and "data" signifies raw information that, when analyzed and processed, gives meaning to an object or matter. Combined, the two words give a clear picture of big data: data that is large and can be in structured or unstructured form. Here, structured denotes data with a high degree of organization, whereas unstructured refers to unorganized data that lacks a proper model.
- Published Papers Regarding Data Analytics in Various Health Care Fields
- Big Data Analytics Theories and Techniques In Health Care
- Installing a Big Data Analytics framework (Hadoop or Spark)
- Installing R - Step by Step
- The Functionality Of R With Regard To Health Care
Uses of Big data analytics
Big data is defined as the raw data obtained from a number of media; studying and analyzing this data to obtain meaningful results is called big data analytics. Below are some of the uses of big data analytics.
- Enhancing performance
- Improving research and health care
- Developing business
- Strengthening security and finding criminals
- Supporting national development
Advantages of Big Data Analytics
- Error detection
- Predicting events
- Reduces time and effort
- Increases efficiency and accuracy
- Progress of company
Who uses Big Data?
Presently, a number of companies have developed software that uses big data to compete in the growing market. In addition, these tools are used in different fields to make sense of big data specific to each finding. These are a few popular companies that use big data to fulfill customer requirements and accomplish their goals.
- HP
- Oracle
- Microsoft
- Splunk
- Amazon
1. Published Papers Regarding Data Analytics in Various Health Care Fields
This section discusses the results found in the paper. The publications relate more to healthcare and information systems than to computer science, since computer science involves only professionals and doctors in healthcare analytics and pays less attention to individuals and patients. We believe that involving individuals will lead to better decision making and more accurate results in the future. Most healthcare analytics papers were published in the US, as shown in the geographical distribution in Figure 7. Linking this result to the research approach distribution in Figure 8, we can see that most publications in healthcare analytics were quantitative studies. This may explain why the number of papers decreased in the last 2 years: quantitative studies are expensive, and they focus on involving professionals in healthcare analytics systems rather than patients. That focus can produce inaccurate study results, create difficulties for healthcare analytics, and harm its future, so researchers stop publishing as they lose interest in the field. We therefore recommend involving individuals, as it would help society adopt and improve healthcare data analytics systems and run them efficiently and smoothly. (Alkhatib, Talaei-Khoei & Ghapanchi, 2016)
2. Big Data Analytics Theories and Techniques In Health Care.
Theory suggests that big data analysis technology, coupled with the proper business model (a topic for another blog), may be a disruption enabler in at least two ways:
- Enabling technologies make powerful data analytics affordable and accessible. Inexpensive data storage, massive computing power, and widespread information transfer infrastructures make predictive analytics available to smaller players than ever before. Patient population analytics, for example, used to be available only to the largest and wealthiest health systems, which could hire talented teams of statisticians. Now, solutions exist that are affordable even for small providers.
- Accessible and affordable analytics drive operating efficiencies. One third to one half of U.S. health care spending goes to ineffective or inefficient care: hundreds of billions of dollars each year. Theory-based big data solutions can help change this. For example, theory-based (rather than simply correlative) population health management analytics, updated with the latest best clinical practices from medical journals, could help test, validate, and adopt new standards of care much more quickly than the average adoption timeline of 17 years, and in so doing help make outdated, wasteful, and expensive treatment methods obsolete. While perhaps not as headline-making as “The End of Theory,” “Save Hundreds of Billions of Dollars With Better Health Outcomes” is pretty close. (Bean, 2016)
3. Installing a Big Data Analytics framework (Hadoop or Spark)
Step 1: Install Java
First, install Java by running the following commands, then check the installed version.
- sudo apt-add-repository ppa:webupd8team/java
- sudo apt-get update
- sudo apt-get install oracle-java7-installer
- java -version
If Java is installed, you will see output similar to the following:
java version "1.7.0_71"
Java(TM) SE Runtime Environment (build 1.7.0_71-b13)
Java HotSpot(TM) Client VM (build 25.0-b02, mixed mode)
Step 2: Install Scala
Next, download the latest version of Scala from www.scala-lang.org, then run the following commands.
- sudo mkdir /usr/local/src/scala
- sudo tar -xvf scala-2.11.7.tgz -C /usr/local/src/scala/
Open .bashrc in an editor (for example, nano ~/.bashrc) and add the following lines:
- export SCALA_HOME=/usr/local/src/scala/scala-2.11.7
- export PATH=$SCALA_HOME/bin:$PATH
Reload .bashrc and check the installed version:
- source ~/.bashrc
- scala -version
If Scala is installed successfully, you will see output similar to the following (the version number reflects the release you installed):
Scala code runner version 2.11.6 -- Copyright 2002-2013, LAMP/EPFL
Step 3: Install Git
Next, install Git with the following command.
- sudo apt-get install git
Step 4: Build Spark
The final step is to download the latest version of Spark from www.spark.apache.org, then extract the archive and build Spark from inside the extracted directory.
- tar -xvf spark-1.4.1.tgz
- cd spark-1.4.1
- sbt/sbt assembly
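Before starting the Spark build, it can help to confirm that the tools from Steps 1 to 3 are actually on the PATH. The following is a minimal sketch, not part of the original instructions; `check_tool` is a hypothetical helper name, and the list of tools simply mirrors the steps above.

```shell
#!/usr/bin/env bash
# Report whether each prerequisite from Steps 1-3 is available on the PATH.
# check_tool is a hypothetical helper, not part of Java, Scala, Git, or Spark.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
        return 1
    fi
}

# Count how many of the three prerequisites are missing.
missing=0
for tool in java scala git; do
    check_tool "$tool" || missing=$((missing + 1))
done
echo "missing prerequisites: $missing"
```

If any tool is reported as missing, repeating the corresponding step (and reloading .bashrc for Scala) is the first thing to try.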
Issues and Recommendation
Although Spark is reported to work up to 100 times faster than Hadoop in certain circumstances, it does not provide its own distributed storage system. Because Spark has no distributed file system of its own, it requires one provided by a third party. For this reason, many Big Data projects install Spark on top of Hadoop, where Spark’s advanced analytics applications can make use of data stored in the Hadoop Distributed File System (HDFS). (Marr, 2016)
4. Installing R - Step by Step
Once you have installed R on a Windows computer, you can install an additional package by following the steps below:
1. To start R, follow either step 2 or 3.
2. Check if there is an “R” icon on the desktop of the computer that you are using. If so, double-click on the “R” icon to start R. If you cannot find an “R” icon, try step 3 instead.
3. Click on the “Start” button at the bottom left of your computer screen, choose “All programs”, and start R by selecting “R” (or R X.X.X, where X.X.X gives the version of R, e.g. R 2.10.0) from the menu of programs.
4. The R console (a rectangle) should pop up.
5. Once you have started R, you can install an R package (e.g. the “rmeta” package) by choosing “Install package(s)” from the “Packages” menu at the top of the R console. This will ask you which website you want to download the package from; choose “Ireland” (or another country, if you prefer). It will also bring up a list of available packages; choose the package that you want to install from that list (e.g. “rmeta”).
6. This will install the “rmeta” package.
7. The “rmeta” package is now installed. Whenever you want to use the “rmeta” package after this, after starting R, you first have to load the package by typing into the R console:
> library("rmeta")
Issues and recommendation
In my healthcare data, I wanted to convert dollar values to integers (i.e. $21,000 to 21000), so I used gsub. I also wanted to focus on the payment estimate.
- So I used the melt() function that is part of reshape2.
- With my data melted, I wanted to get the average estimate for heart attack patients by state.
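The original work was done in R with gsub and reshape2, and the exact calls are not shown. As a rough illustration of the same two ideas (stripping "$" and "," from dollar strings, then averaging by state), here is a small shell sketch using tr and awk. The sample rows, the "|" delimiter, and the function names `to_integer` and `average_by_state` are all assumptions made for this example; the data is invented and already in long form, so the melt() step is not reproduced.

```shell
#!/usr/bin/env bash
# Hypothetical sample rows in "state|payment estimate" form. Not real data.
sample='TX|$21,000
TX|$19,000
NY|$25,000'

# Remove "$" and "," from one dollar string. This plays the role that a call
# such as gsub("[$,]", "", x) would play in R (an assumed pattern).
to_integer() {
    printf '%s\n' "$1" | tr -d '$,'
}

# Read "state|amount" rows on stdin, clean the amounts, and print the
# average amount per state, sorted by state name.
average_by_state() {
    tr -d '$,' |
        awk -F'|' '{ sum[$1] += $2; n[$1]++ }
                   END { for (s in sum) printf "%s %d\n", s, sum[s] / n[s] }' |
        sort
}

to_integer '$21,000'
printf '%s\n' "$sample" | average_by_state
```

With the sample rows above, the two Texas estimates average to 20000 and the single New York estimate stays at 25000.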
5. The Functionality Of R With Regard To Health Care
The researchers introduce R as a tool for analyzing data. In the five years since the inception of their book, the number of R applications has exploded, and it is not possible to do justice to R’s diverse capabilities in a few words. R is well suited to analyzing data because of the ease of interactive exploration and of making visualizations. (Seefeld & Linder, 2016)
References:
Alkhatib, M., Talaei-Khoei, A., & Ghapanchi, A. (2016). Analysis of Research in Healthcare Data Analytics. arXiv preprint arXiv:1606.01354.
Bean, D. (2016). Big Data: The end of theory in healthcare? - Christensen Institute. Christensen Institute. Retrieved 31 October 2016, from http://www.christenseninstitute.org/blog/big-data-the-end-of-theory-in-healthcare/
Marr, B. (2016). Forbes Welcome. Forbes.com. Retrieved 31 October 2016, from http://www.forbes.com/sites/bernardmarr/2015/06/22/spark-or-hadoop-which-is-the-best-big-data-framework/#1bfc431d532c
Seefeld, K. & Linder, E. (2016). Statistics Using R with Biological Examples. https://cran.r-project.org. Retrieved 31 October 2016, from https://cran.r-project.org/doc/contrib/Seefeld_StatsRBio.pdf
Parmar, D. (2016). R Packages: A Healthcare Application. Datasciencecentral.com. Retrieved 31 October 2016, from http://www.datasciencecentral.com/profiles/blogs/r-packages-a-healthcare-application