In this project you will write a program that analyzes data on Amazon's top 50 books, collected between 2009 and 2019. You will need to load data from a file, compute some common statistics from it, and print a graph of the data known as a histogram.
Below is an example execution of this program on a file containing some of the Amazon data:
Enter a filename: bestsellers.csv
Histogram of Amazon Bestseller Ratings
--------------------------------------
3.2
3.3 *
3.4
3.5
3.6 *
3.7
3.8 **
3.9 ***
4.0 *************
4.1 *****
4.2 ****
4.3 *********************
4.4 ********************************
4.5 ***************************************
4.6 *****************************************************************************************
4.7 **************************************************************************************************
4.8 *************************************************************************************************
4.9 ***********************************
5.0
--------------------------------------
Total books rated: 440
Median score: 4.7
Average score: 4.6
Standard Deviation: 0.23
The first few lines of the input file bestsellers.csv (used above) look like the following:
Name,Author,User Rating,Reviews,Price,Year,Genre
10-Day Green Smoothie Cleanse,JJ Smith,4.7,17350,8,2016,Non Fiction
11/22/63: A Novel,Stephen King,4.6,2052,22,2011,Fiction
12 Rules for Life: An Antidote to Chaos,Jordan B. Peterson,4.7,18979,15,2018,Non Fiction
1984 (Signet Classics),George Orwell,4.7,21424,6,2017,Fiction
A Dance with Dragons (A Song of Ice and Fire),George R. R. Martin,4.4,12643,11,2011,Fiction
A Game of Thrones / A Clash of Kings / A Storm of Swords / A Feast of Crows / A Dance with Dragons,George R. R. Martin,4.7,19735,30,2014,Fiction
Note that the first line of the file shows what each column holds. Our primary focus for this project will be the column holding the ratings for the books.
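As a rough illustration of reading the ratings column, the sketch below skips the header line, splits each remaining line on commas, and parses the third field. The names `RatingColumnSketch`, `parseRating`, and `parseRatings` are hypothetical, and the code assumes that no field contains a comma. That holds for the sample lines above but is not guaranteed for every title in the data files.

```java
import java.util.ArrayList;
import java.util.List;

final class RatingColumnSketch {
    // 0-based index of "User Rating" in the header line shown above.
    static final int RATING_COLUMN = 2;

    // Extracts the rating from one data line.
    // Assumption: no field contains a comma, so a plain split is enough.
    static double parseRating(String csvLine) {
        String[] fields = csvLine.split(",");
        return Double.parseDouble(fields[RATING_COLUMN].trim());
    }

    // Parses every non-empty line after the header into a list of ratings.
    static List<Double> parseRatings(List<String> lines) {
        List<Double> ratings = new ArrayList<>();
        for (int i = 1; i < lines.size(); i++) { // i = 1 skips the header line
            String line = lines.get(i);
            if (!line.trim().isEmpty()) {
                ratings.add(parseRating(line));
            }
        }
        return ratings;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
            "Name,Author,User Rating,Reviews,Price,Year,Genre",
            "11/22/63: A Novel,Stephen King,4.6,2052,22,2011,Fiction",
            "1984 (Signet Classics),George Orwell,4.7,21424,6,2017,Fiction");
        System.out.println(parseRatings(lines)); // prints [4.6, 4.7]
    }
}
```

If a title in the real files does contain a comma, the third field will no longer be the rating, and a proper CSV parser would be the safer choice.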
Histograms
A histogram is a graph in which each bar represents a count of some kind. In the graph we are asking you to build, each bar represents the number of books with that rating in the data file. For example, in the graph above there is 1 book with a rating of 3.3, 1 book with a rating of 3.6, 2 books with a rating of 3.8, and so on. One of your tasks for this project is to generate a histogram report for each of the data files that we have provided.
When you generate this graph, note that the minimum and maximum values in the graph are determined by the data in the file: there should be one line for the rating 0.1 below the smallest rating in the file and one line for the rating 0.1 above the highest rating in the file. For example, the graph above has a line for 3.2 because the minimum rating is 3.3. Below is an example from a second file, bestsellersLow.csv, that has a different range of ratings:
Enter a filename: bestsellersLow.csv
Histogram of Amazon Bestseller Ratings
--------------------------------------
3.7
3.8 **
3.9 ***
4.0 *************
4.1 *****
4.2 ****
4.3 *********************
4.4 ********************************
4.5 ***************************************
4.6 ********************
4.7 ****************
4.8 **************
4.9 ***********************************
5.0
--------------------------------------
Total books rated: 204
Median score: 4.5
Average score: 4.5
Standard Deviation: 0.27
Since this file has 3.8 as its lowest rating, it starts with a 3.7. And since 4.9 is the highest rating in the file, 5.0 is the last rating listed in the histogram graph.
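One possible way to apply this range rule is sketched below. It converts each rating to a whole number of tenths (4.7 becomes 47) so that the rows can be stepped through with integers rather than by repeatedly adding 0.1 to a double. The names `HistogramSketch` and `histogramLines` are hypothetical, and the sketch assumes every rating has exactly one decimal place.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class HistogramSketch {
    // Builds one row per 0.1 step from (min - 0.1) to (max + 0.1).
    static List<String> histogramLines(List<Double> ratings) {
        List<String> lines = new ArrayList<>();
        if (ratings.isEmpty()) {
            return lines; // nothing to graph
        }

        // Find the smallest and largest rating, measured in tenths.
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (double r : ratings) {
            int tenths = (int) Math.round(r * 10);
            min = Math.min(min, tenths);
            max = Math.max(max, tenths);
        }

        // One extra empty row below the minimum and one above the maximum.
        int low = min - 1;
        int high = max + 1;
        int[] counts = new int[high - low + 1];
        for (double r : ratings) {
            counts[(int) Math.round(r * 10) - low]++;
        }

        for (int t = low; t <= high; t++) {
            StringBuilder row = new StringBuilder();
            row.append(String.format(Locale.US, "%.1f ", t / 10.0));
            for (int i = 0; i < counts[t - low]; i++) {
                row.append('*');
            }
            lines.add(row.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : histogramLines(List.of(3.8, 3.9, 3.8))) {
            System.out.println(line);
        }
    }
}
```

For the three ratings in `main`, the rows run from 3.7 to 4.0, with two stars on the 3.8 row and one on the 3.9 row.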
Summary Statistics
At the bottom of the report you need to generate 4 additional statistics - the count of the number of ratings in the file, the median score, the average score and the standard deviation.
Median
The median score is the "middle" score: given a list of scores, half the scores on the list will be larger than the median score and the other half will be smaller. An algorithm to find the median score is:
sort the list of scores
if the list has an odd number of scores, the middle score is the median
if the list has an even number of scores, add the two middle scores together and divide the sum by 2 to find the median
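The steps above can be sketched as follows. The names `MedianSketch` and `median` are hypothetical, and throwing an exception for an empty list is one design choice among several; the assignment does not say what should happen in that case.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class MedianSketch {
    // Returns the median of the scores without modifying the caller's list.
    static double median(List<Double> scores) {
        if (scores.isEmpty()) {
            throw new IllegalArgumentException("median of an empty list is undefined");
        }
        List<Double> sorted = new ArrayList<>(scores);
        Collections.sort(sorted);                  // step 1: sort the scores

        int n = sorted.size();
        int mid = n / 2;
        if (n % 2 == 1) {
            return sorted.get(mid);                // odd count: the middle score
        }
        // even count: average of the two middle scores
        return (sorted.get(mid - 1) + sorted.get(mid)) / 2.0;
    }

    public static void main(String[] args) {
        System.out.println(median(List.of(4.7, 4.5, 4.6)));      // prints 4.6
        System.out.println(median(List.of(5.0, 4.0, 4.5, 3.5))); // prints 4.25
    }
}
```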
Average
The average score is the sum of all of the scores divided by the count of the scores.
Standard Deviation
The standard deviation is a measure of how "close" the data is to its average. A large standard deviation indicates data that is spread out far from its average, while a small standard deviation indicates that the data is clumped close to its average. For data that is roughly normally distributed, about 95% of the values will be within 2 standard deviations of the average.
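A minimal sketch of both calculations follows. The text above does not say whether the population or the sample standard deviation is wanted, so this sketch assumes the population form, which divides the sum of squared differences by the count N; the sample form would divide by N - 1 instead. The names `SpreadSketch`, `mean`, and `populationStdDev` are hypothetical.

```java
import java.util.List;

final class SpreadSketch {
    // Average: the sum of the scores divided by how many there are.
    static double mean(List<Double> scores) {
        if (scores.isEmpty()) {
            throw new IllegalArgumentException("mean of an empty list is undefined");
        }
        double sum = 0.0;
        for (double s : scores) {
            sum += s;
        }
        return sum / scores.size();
    }

    // Population standard deviation: the square root of the average
    // squared distance from the mean (assumed variant, divides by N).
    static double populationStdDev(List<Double> scores) {
        double avg = mean(scores);
        double sumOfSquares = 0.0;
        for (double s : scores) {
            double diff = s - avg;
            sumOfSquares += diff * diff;
        }
        return Math.sqrt(sumOfSquares / scores.size());
    }

    public static void main(String[] args) {
        List<Double> scores = List.of(2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0);
        System.out.println(mean(scores));             // prints 5.0
        System.out.println(populationStdDev(scores)); // prints 2.0
    }
}
```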

Answer:

The Java program below loads the ratings from the file, prints the histogram, and then reports the number of books rated, the median score, the average score, and the standard deviation.

What is a histogram?

A histogram is a visual representation of data points arranged into ranges that the user has chosen. The histogram resembles a bar graph in appearance.

// Analysis.java

import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Scanner;

public class Analysis {

  // display the histogram for the input list of ratings, sorted in ascending order
  public static void displayHistogram(ArrayList<Double> ratings)
  {
      System.out.println("Histogram of Amazon Bestseller Ratings");
      System.out.println("---------------------------------------");

      if(!ratings.isEmpty())
      {
          // work in whole tenths (4.7 -> 47): repeatedly adding 0.1 to a double
          // accumulates rounding error and can drop the last row of the graph
          int low = (int)Math.round(ratings.get(0)*10) - 1;
          int high = (int)Math.round(ratings.get(ratings.size()-1)*10) + 1;
          int i = 0; // index of the next rating that has not been counted yet

          // loop from minimum-0.1 to maximum+0.1 in steps of 0.1
          for(int t = low; t <= high; t++)
          {
              System.out.printf("%.1f ", t/10.0); // display the current rating

              // the list is sorted, so all ratings equal to t are consecutive;
              // display one * for each of them
              while(i < ratings.size() && Math.round(ratings.get(i)*10) == t)
              {
                  System.out.print("*");
                  i++;
              }
              System.out.println();
          }
      }

      System.out.println("---------------------------------------");
  }

  // compute and return the median of the input list of ratings, sorted in ascending order
  public static double median(ArrayList<Double> ratings)
  {
      int n = ratings.size();

      if(n == 0) // list is empty
          return 0;

      if(n%2 == 0) // even number of ratings: median is the average of the middle 2 ratings
          return (ratings.get(n/2 - 1) + ratings.get(n/2))/2;

      return ratings.get(n/2); // odd number of ratings: median is the middle value
  }

  // compute and return the mean of the input list of ratings
  public static double average(ArrayList<Double> ratings)
  {
      if(ratings.isEmpty()) // list is empty
          return 0;

      double totalRatings = 0;

      // add each rating to totalRatings
      for(int i=0;i<ratings.size();i++)
          totalRatings += ratings.get(i);

      return totalRatings/ratings.size(); // divide the total by the number of ratings
  }

  // compute and return the (population) standard deviation of the input list of ratings
  public static double standardDeviation(ArrayList<Double> ratings)
  {
      if(ratings.isEmpty()) // list is empty
          return 0;

      double avg = average(ratings); // get the average of all ratings
      double sumSquares = 0; // total of the squared differences from the average

      // sum the squared difference of each rating from the average
      for(int i=0;i<ratings.size();i++)
          sumSquares += Math.pow(ratings.get(i)-avg,2);

      return Math.sqrt(sumSquares/ratings.size());
  }

  public static void main(String[] args) {
      Scanner scnr = new Scanner(System.in);

      // input the filename
      System.out.print("Enter a filename: ");
      String filename = scnr.nextLine();
      scnr.close();

      try {
          // open the file for reading
          scnr = new Scanner(new File(filename));

          // list to store the rating of each record in the file
          ArrayList<Double> ratings = new ArrayList<Double>();

          if(scnr.hasNextLine())
              scnr.nextLine(); // read and discard the header line

          // loop over the file line by line
          while(scnr.hasNextLine())
          {
              String line = scnr.nextLine();
              if(line.trim().isEmpty()) // skip blank lines
                  continue;

              // split the line on commas; this assumes no field contains a comma
              String[] fields = line.split(",");

              // the rating is the 3rd field (index 2); convert it to double and store it
              ratings.add(Double.parseDouble(fields[2]));
          }

          scnr.close(); // close the input file

          Collections.sort(ratings); // sort the list in ascending order

          displayHistogram(ratings); // display the histogram

          // display the number of records
          System.out.println("Total books rated: "+ratings.size());

          // display median, average and standard deviation of scores
          System.out.printf("Median score: %.1f\n",median(ratings));
          System.out.printf("Average score: %.1f\n",average(ratings));
          System.out.printf("Standard Deviation: %.2f\n",standardDeviation(ratings));

      } catch (FileNotFoundException e) {
          // file not found, display error and exit the program
          System.out.println("ERROR - File "+filename+" not found");
      }
  }
}

//end of Analysis.java
