The Geography of Taste: Using Yelp to Study Urban Culture
Abstract
1. Introduction
- To what extent is taste a good indicator of the socioeconomic status of communities in American cities?
- By utilizing the concept of taste, can we use restaurant-as-sensor instead of citizen-as-sensor to examine the socioeconomic dynamics of neighborhoods? This issue is especially important to us since business data is far more accessible and plentiful than individual-level data.
- Are American cities composed of regions with different dominant taste cultures? Do the regions of one city resemble regions in other cities?
2. Materials and Methods
2.1. Literature Review
2.1.1. Previous Attempts to Define Sociospatial Boundaries
2.1.2. Taste as an Indicator of Urban Culture
2.1.3. How Can Information about Restaurants Help Us Understand the Socioeconomic and Cultural Structure of Cities?
2.2. Data
- Data provided by Yelp [57], which covers 11 cities, 8 of which are in North America (i.e., Cleveland, Pittsburgh, Charlotte, Urbana-Champaign, Phoenix, Las Vegas, Toronto, and Montreal). This data includes 4.1 M reviews by 1 M users for 144 K businesses as well as 1.1 M business attributes (e.g., hours, parking availability, and ambience). In Montreal, 11,284 of the 86,054 reviews were in French, as identified with the langdetect 1.0.7 package in Python [58]. Since English reviews may not equally represent all Montreal neighborhoods, demographics, and residents, we treated Montreal as an outlier and removed it from our analysis. Because this study is concerned only with restaurants in English-speaking North American metropolitan areas, we also filtered out Urbana-Champaign (a small city) and points that fell outside the metropolitan boundaries, and we kept only businesses tagged as restaurants. This process resulted in 2,186,054 reviews for 34,231 restaurants. The data includes the following fields: Business ID, User ID, Reviews, Business Name, Star Rating, Address, City, State, Zip code, Business Category, Review Count, Longitude, Latitude. The geographic coordinates represent the location of businesses.
- As discussed in the introduction, we wanted to see whether we can characterize the socioeconomic status of urban communities without any information about users. This matters because, although it is possible to scrape data from websites such as Yelp, user IDs are often not exposed in the interface and cannot be scraped easily. In other words, extracting information about businesses from the web is often easier than finding individual-level data. To investigate the extent to which business-level data scraped from the web and stripped of user IDs can inform us about neighborhoods, we scraped restaurant reviews and attributes for the Boston, Washington D.C., Detroit, and Philadelphia metropolitan areas. The data collection took a few months in 2017, the same year as the Yelp contest data. All four cities are characterized by high segregation as well as ethnic and cultural diversity. This data includes 509,319 reviews for 120,801 restaurants. Using the earlier dataset, we expect to be able to study the communities in this dataset, where the user IDs are absent. In addition, the four cities are important metropolitan areas, and studying their sociospatial dynamics can be useful per se. Table 1 provides a summary of the data.
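The filtering applied to the Yelp contest data can be sketched as follows. This is a minimal illustration rather than the authors' code: the `Business` record, the `METRO_BOUNDS` table, and the `keep_business` helper are hypothetical names, and a rectangular bounding box with made-up coordinates stands in for the actual metropolitan boundaries, whose geometry is not given in the text.

```python
from dataclasses import dataclass, field

@dataclass
class Business:
    business_id: str
    city: str
    latitude: float
    longitude: float
    categories: list = field(default_factory=list)

# Hypothetical bounding boxes: (min_lat, min_lon, max_lat, max_lon).
# Assumption: a real analysis would use metropolitan boundary polygons.
METRO_BOUNDS = {
    "Pittsburgh": (40.0, -80.6, 40.9, -79.4),
    "Cleveland": (41.0, -82.4, 41.8, -81.0),
}

# Cities dropped in the text: Montreal (many French reviews)
# and Urbana-Champaign (a small city).
EXCLUDED_CITIES = {"Montreal", "Urbana-Champaign"}

def keep_business(b, bounds=METRO_BOUNDS, excluded=EXCLUDED_CITIES):
    """Return True if the business survives all three filters."""
    if b.city in excluded or b.city not in bounds:
        return False
    if "Restaurants" not in b.categories:
        return False
    min_lat, min_lon, max_lat, max_lon = bounds[b.city]
    inside_lat = min_lat <= b.latitude <= max_lat
    inside_lon = min_lon <= b.longitude <= max_lon
    return inside_lat and inside_lon

def filter_businesses(businesses):
    """Keep only restaurants inside the retained metropolitan areas."""
    return [b for b in businesses if keep_business(b)]
```

Reviews would then be kept only if their Business ID belongs to one of the retained restaurants.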
2.3. Methodology
2.3.1. Feature Generation
- First, we removed commonly used words with an English stop-word list [60] and then chose features from among the 1000 most frequent remaining words. Forty-five features in the three categories (i.e., foods and drinks, food adjectives, and ambience adjectives) were selected at this step (Appendix A).
- Although frequent features can provide much information about restaurants, we expected the comments to contain more specific words as well. For example, different types of fish (e.g., haddock, tilapia) or different adjectives used to describe an ambience (e.g., divey, hipster) are not among the frequent words. To address this problem, we used the Word2Vec model, an open-source model developed by Google in 2013 that transforms the words in a document into high-dimensional vectors by using a Neural Network Language Model (NNLM) [61,62]. Given N user comments, the n-th word in a comment, and the size of the context window centered on that word, the NNLM is trained by maximizing a likelihood function defined over each word and its context window.
- We binarized the occurrence of each word selected in the previous step within each comment (1 if the word appears, 0 otherwise) and aggregated these indicators for every restaurant. Given that these words are not equally common, we used the Term Frequency-Inverse Document Frequency (TF-IDF) model to weight these features.
- The features generated in the previous steps can sometimes fall into categories which can be even more important than the individual features themselves. For example, specific fish types (e.g., salmon) might be important but less informative than the combination of all types of fish. This information tells us that seafood is popular in a certain area. Appendix B indicates the groups of features that we combined in order to generate new features. By including these new features, a total of 477 potentially-unnecessary features remain (e.g., does the word “water” really explain anything about a community’s taste?). In the next step, we explain our methodology for reducing the dimensionality and choosing the most important features.
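The binarize, aggregate, weight, and combine steps above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: each restaurant is treated as one document, the weight is the plain `tf * log(N / df)` form of TF-IDF (the exact variant used is not given here), group features are formed by summing the weights of their members, and all function names, the toy reviews, and the membership of `seafood_types` are hypothetical.

```python
import math
from collections import Counter, defaultdict

def restaurant_term_counts(reviews, vocabulary):
    """Count, per restaurant, how many reviews mention each vocabulary word.

    reviews: iterable of (restaurant_id, review_text) pairs.
    vocabulary: set of feature words.
    A word counts at most once per review (binarized per comment).
    """
    counts = defaultdict(Counter)
    for restaurant_id, text in reviews:
        tokens = set(text.lower().split())  # naive whitespace tokenizer
        bucket = counts[restaurant_id]      # creates the entry if missing
        for word in vocabulary & tokens:
            bucket[word] += 1
    return counts

def tfidf(counts):
    """Weight each count by log(N / df), with restaurants as documents."""
    n_docs = len(counts)
    doc_freq = Counter()
    for term_counts in counts.values():
        doc_freq.update(term_counts.keys())
    return {
        rid: {w: tf * math.log(n_docs / doc_freq[w]) for w, tf in tc.items()}
        for rid, tc in counts.items()
    }

def add_group_features(weights, groups):
    """Add one combined feature per group by summing its members' weights."""
    for features in weights.values():
        for group_name, members in groups.items():
            features[group_name] = sum(features.get(m, 0.0) for m in members)
    return weights

# Toy usage with hypothetical reviews and a hypothetical group definition.
toy_reviews = [
    ("r1", "great salmon and tilapia"),
    ("r1", "salmon salmon again"),
    ("r2", "pizza place"),
    ("r3", "salmon pizza"),
]
toy_weights = tfidf(restaurant_term_counts(toy_reviews, {"salmon", "tilapia", "pizza"}))
toy_features = add_group_features(toy_weights, {"seafood_types": ["salmon", "tilapia"]})
```

With this weighting, a word mentioned at every restaurant receives a weight of zero, which is one way rarely informative words such as "water" can be discounted before the later feature-selection step.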
2.3.2. User’s Taste and the Curse of Dimensionality
2.3.3. Defining the Spatial Bins
3. Results
3.1. Selected Features
3.2. Clustering Results
4. Conclusions
Author Contributions
Funding
Conflicts of Interest
Appendix A
Category | List of Features |
---|---|
Food | chicken, pizza, ketchup, cheese, salad, hot dog, burger, bacon, burrito, mushroom, fish, wings, strawberry |
Drink | coffee, tea, beer, soda, water, wine, cocktail, alcohol, smoothie |
Food adjectives | Mexican, Italian, Chinese, sweet, fried, spicy, vegetarian, greasy, homemade, juicy, organic, stuffed, crispy |
Ambiance | cozy, hipster, trendy, classy, modern, homey, intimate, romantic, upscale, divey |
Appendix B
New Feature | List of Combined Features |
---|---|
steak_types | meatloaf, Barclay, flank, wagyu, kalbi, tenderloin, striploin, bavette, rib, brisket, mignon, steak, ribeye |
meat_types | chicken, meat, beef, pork, lamb, veal, duck, turkey, steak |
sweets_types | yogurt, gelato, pudding, cupcake, biscuit, pie, tiramisu, crepe, custard, tart, sorbet, Nutella, cheesecake, cream, cannoli, muffin, donut, cookie, cake, shake |
fast_food | pizza, hot dog, sandwich, burger, chips, pepperoni, max, finger, cheeseburger, cheesesteak, calzone, meatball, hoagie, poutine, blt, Rueben, wing |
vegie_types | turnip, lettuce, celery, seaweed, parsley, scallion, eggplant, broccoli, zucchini, kale, cilantro, veggie, ceasar, cabbage, cucumber, basil, vegetable, mushroom, sprout, carrot, asparagus, bean, onion, tomato, coleslaw, avocado, spinach, artichoke |
breakfast_types | bacon, sausage, egg, benedict, scramble, omelet, bagel, pancake, croissant, pretzel, syrup, waffle, roast |
fruite_types | pineapple, peach, strawberry, raspberry, blueberry, coconut, apple, mango, banana, orange |
nut_types | walnut, pecan, peanut, almond |
herb_types | oregano, thyme, fennel, sumac, paprika, garnish, herb, radish, chive, dill, arugula, mint |
dressing_types | ranch, ketchup, mayo, gravy, marinara, sriracha |
coffee_types | espresso, cappuccino, decaf, americano, mocha, latte |
soda_types | Pepsi, Fanta, spirit, coke, soda |
softliq_types | champagne, beer, wine, margarita, sangria, mimosa, cider |
hardliq_types | tequila, whiskey, vodka, martini, bourbon, shot |
ethnic_food | Thai, Chinese, Mexican, Italian, Asian, Indian, Japanese, Vietnamese, Hawaiian, Sicilian, Arabic, Middle Eastern, Korean, Taiwanese, Persian, Greek, Lebanese, Portuguese, Ethiopian, Spanish |
latin_types | salsa, burrito, quesadilla, taco, carnitas, tamale, guacamole, tapa, enchilada, tortilla, fajita, carne, jalapeno, nacho, ceviche, empanada |
Italian_types | pastrami, panini, lasagna, bruschetta, pasta, prosciutto, stromboli, vermicelli, risotto, spaghetti, pesto, chorizo, gnocchi |
Asian_types | fusion, sesame, wonton, spring roll, omakas, sushi, aman, tofu, kimchi, nigiri, sashimi, mushi, noodle, teriyaki |
Mideast_types | shawarma, flatbread, pita, naan, hummus, falafel |
pos_ambience | cozy, homey, classy, trendy, artsy, urbane, posh, swanky, upscale, festive, romantic, eclectic, elegant, chic, stylish |
neg_ambience | casual, divey, kitschy, masculine |
style_types | hipster, hippie, bohemian, rustic, modern, minimalistic, contemporary, retro, deco, quaint |
material_types | wooden, hardwood, marble, concrete, mosaic, metal, steel, brick |
References
- Musterd, S.; Ostendorf, W. Urban Segregation and the Welfare State: Inequality and Exclusion in Western Cities; Routledge: Abingdon, UK, 2013. [Google Scholar]
- Sassen, S. The Global City; Princeton University Press: Princeton, NJ, USA, 1991. [Google Scholar]
- Badcock, B. Restructuring and spatial polarization in cities. Prog. Hum. Geogr. 1997, 21, 251–262. [Google Scholar] [CrossRef]
- Dear, M.; Flusty, S. Postmodern Urbanism. Ann. Assoc. Am. Geogr. 1998, 88, 50–72. [Google Scholar] [CrossRef]
- Wellman, B. The Community Question: The Intimate Networks of East Yorkers. Am. J. Sociol. 1979, 84, 1201–1231. [Google Scholar] [CrossRef]
- Wenger, G.C. A Comparison of Urban with Rural Support Networks: Liverpool and North Wales. Ageing Soc. 1995, 15, 59–81. [Google Scholar] [CrossRef]
- Hampton, K.; Wellman, B. Neighboring in Netville: How the Internet supports community and social capital in a wired suburb. City Community 2003, 2, 277–311. [Google Scholar] [CrossRef]
- Madanipour, A.; Cars, G.; Allen, J. Social Exclusion in European Cities: Processes, Experiences, and Responses; Psychology Press: Abingdon, UK, 2000; Volume 23. [Google Scholar]
- Lyons, W.E.; Lowery, D. Citizen Responses to Dissatisfaction in Urban Communities: A Partial Test of a General Model. J. Politics 1989, 51, 841–868. [Google Scholar] [CrossRef]
- Bracken, I.; Martin, D. The generation of spatial population distributions from census centroid data. Environ. Plan. A 1989, 21, 537–543. [Google Scholar] [CrossRef] [PubMed]
- Zheng, Y.; Capra, L.; Wolfson, O.; Yang, H. Urban Computing: Concepts, Methodologies, and Applications. ACM Trans. Intell. Syst. Technol. 2014, 5, 1–55. [Google Scholar] [CrossRef]
- Hennion, A. Those things that hold us together: Taste and sociology. Cult. Sociol. 2007. [Google Scholar] [CrossRef]
- Kant, I. Critique of Judgment; Hackett Publishing Company: Indianapolis, IN, USA, 1787. [Google Scholar]
- Cummings, J. The Theory of the Leisure Class. J. Political Econ. 1899. [Google Scholar] [CrossRef]
- Bourdieu, P. Distinction: A Social Critique of the Judgment of Taste; Routledge: Abingdon, UK, 1984; Volume 1. [Google Scholar]
- Rinzivillo, S.; Mainardi, S.; Pezzoni, F.; Coscia, M.; Pedreschi, D.; Giannotti, F. Discovering the Geographical Borders of Human Mobility. Künstl. Intell. 2012, 26, 253–260. [Google Scholar] [CrossRef]
- Ratti, C.; Sobolevsky, S.; Calabrese, F.; Andris, C.; Reades, J.; Martino, M.; Claxton, R.; Strogatz, S.H. Redrawing the map of Great Britain from a network of human interactions. PLoS ONE 2010, 5. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Yuan, J.; Zheng, Y.; Xie, X. Discovering regions of different functions in a city using human mobility and POIs. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Beijing, China, 12–16 August 2012; pp. 186–194. [Google Scholar] [CrossRef]
- Yuan, N.; Zhang, F.; Lian, D.; Zheng, K. We know how you live: Exploring the spectrum of urban lifestyles. In Proceedings of the First ACM Conference on Online Social Networks, Boston, MA, USA, 7–8 October 2013; pp. 3–14. [Google Scholar] [CrossRef]
- Liu, H. Social network profiles as taste performances. J. Comput. Commun. 2007, 13, 252–275. [Google Scholar] [CrossRef]
- Rahimi, S.; Liu, X.; Andris, C. Hidden style in the city: An analysis of Geolocated Airbnb rental images in Ten Major Cities. In Proceedings of the 2nd ACM SIGSPATIAL Workshop on Smart Cities and Urban Analytics, Burlingame, CA, USA, 31 October–3 November 2016. [Google Scholar]
- Cranshaw, J.; Hong, J.I.; Sadeh, N. The Livehoods Project: Utilizing Social Media to Understand the Dynamics of a City. In Proceedings of the 6th International AAAI Conference on Weblogs and Social Media, Dublin, Ireland, 4–7 June 2012; pp. 58–65. [Google Scholar]
- Yin, Z.; Cao, L.; Han, J.; Zhai, C.; Huang, T. Geographical topic discovery and comparison. In Proceedings of the 20th International Conference on World Wide Web, Hyderabad, India, 28 March–1 April 2011; pp. 247–256. [Google Scholar]
- Wakamiya, S.; Lee, R. Crowd-sourced Urban Life Monitoring: Urban Area Characterization based Crowd Behavioral Patterns from Twitter Categories and Subject Descriptors. In Proceedings of the 6th International Conference on Ubiquitous Information Management and Communication, Kuala Lumpur, Malaysia, 20–22 February 2012. [Google Scholar] [CrossRef]
- Li, Q.; Zheng, Y.; Xie, X.; Chen, Y.; Liu, W.; Ma, W.Y. Mining user similarity based on location history. In Proceedings of the 16th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, Irvine, CA, USA, 5–7 November 2008. [Google Scholar]
- Hung, C.-C.; Hung, C.; Chang, C.-W.; Chang, C.; Peng, W.; Peng, W.-C. Mining Trajectory Profiles for Discovering User Communities. In Proceedings of the 2009 International Workshop on Location Based Social Networks, Seattle, WA, USA, 3 November 2009. [Google Scholar] [CrossRef]
- Zheng, Y.; Xie, X.; Ma, W. GeoLife: A Collaborative Social Networking Service among User, Location and Trajectory. IEEE Data Eng. Bull. 2010, 33, 32–40. [Google Scholar]
- Xiao, X.; Zheng, Y.; Luo, Q.; Xie, X. Finding similar users using category-based location history. In Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems, San Jose, CA, USA, 2–5 November 2010. [Google Scholar] [CrossRef]
- He, J.; Chu, W.W. A Social Network-Based Recommender System (SNRS); Springer: Boston, MA, USA, 2010; Volume 12, ISBN 9781441962867. [Google Scholar]
- Bonhard, P.; Harries, C.; McCarthy, J.; Sasse, M. Accounting for taste: Using profile similarity to improve recommender systems. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Montréal, QC, Canada, 22–27 April 2006; pp. 1057–1066. [Google Scholar] [CrossRef]
- Knulst, W.; Kraaykamp, G. Trends in leisure reading: Forty years of research on reading in The Netherlands. Poetics 1998, 26, 21–41. [Google Scholar] [CrossRef]
- Van Eijck, K. Social Differentiation in Musical Taste Patterns. Soc. Forces 2001, 79, 1163–1185. [Google Scholar] [CrossRef]
- Lewis, K.; Kaufman, J.; Gonzalez, M.; Wimmer, A.; Christakis, N. Tastes, ties, and time: A new social network dataset using Facebook.com. Soc. Netw. 2008, 30, 330–342. [Google Scholar] [CrossRef] [Green Version]
- Zukin, S. Urban Lifestyles: Diversity and Standardisation in Spaces of Consumption. Urban Stud. 1998, 35, 825–839. [Google Scholar] [CrossRef]
- Harvey, D. The Condition of Postmodernity; Blackwell: Oxford, UK, 1991; Volume 67, ISBN 0-631-16292-5. [Google Scholar]
- Lash, S.; Urry, J. Economies of Signs and Space; Sage: Newcastle upon Tyne, UK, 1994; ISBN 0803984723. [Google Scholar]
- Zukin, S. The Cultures of Cities; Wiley-Blackwell: Hoboken, NJ, USA, 1996; Volume 25, ISBN 1557864373. [Google Scholar]
- Mullins, P.; Natalier, K.; Smith, P.; Smeaton, B. Cities and Consumption Spaces. Urban Aff. Rev. 1999, 35, 44–71. [Google Scholar] [CrossRef]
- Bocock, R. Consumption; Routledge: London, UK; New York, NY, USA, 1993. [Google Scholar]
- Featherstone, M. Consumer Culture and Postmodernism; Sage: Newcastle upon Tyne, UK, 1991; ISBN 9781412910132. [Google Scholar]
- Clarke, S. The World of Consumption; Routledge: London, UK; New York, NY, USA, 1994. [Google Scholar]
- Miller, D. Consumption Studies as the Transformation of Anthropology. In Acknowledging Consumption; Routledge: New York, NY, USA, 1995; pp. 272–301. ISBN 0415106893. [Google Scholar]
- Neal, Z.P. Culinary deserts, gastronomic oases: A classification of US cities. Urban Stud. 2006. [Google Scholar] [CrossRef]
- Schlosser, E. Fast Food Nation: The Dark Side of the All-American Meal; Houghton Mifflin Harcourt: Boston, MA, USA, 2012. [Google Scholar]
- Yelp. Yelp Information; Yelp: San Francisco, CA, USA, 2017. [Google Scholar]
- Luca, M.; Zervas, G. Fake It Till You Make It: Reputation, Competition, and Yelp Review Fraud. Manag. Sci. 2016. [Google Scholar] [CrossRef]
- Zukin, S.; Lindeman, S.; Hurson, L. The omnivore’s neighborhood? Online restaurant reviews, race, and gentrification. J. Consum. Cult. 2017. [Google Scholar] [CrossRef]
- Rahimi, S.; Andris, C.; Liu, X. Using Yelp to Find Romance in the City: A Case of Restaurants in Four Cities. In Proceedings of the 3rd ACM SIGSPATIAL Workshop on Smart Cities and Urban Analytics, Redondo Beach, CA, USA, 7–10 November 2017. [Google Scholar]
- Hu, B.; Ester, M. Spatial topic modeling in online social media for location recommendation. In Proceedings of the 7th ACM Conference on Recommender Systems, Hong Kong, China, 12–16 October 2013. [Google Scholar]
- Harrison, C.; Jorder, M.; Stern, H.; Stavinsky, F.; Reddy, V.; Hanson, H.; Waechter, H.; Lowe, L.; Gravano, L.; Balter, S. Using Online Reviews by Restaurant Patrons to Identify Unreported Cases of Foodborne Illness—New York City, 2012–2013. Morb. Mortal. Wkly. Rep. 2014, 63, 441–445. [Google Scholar]
- Griffis, H.M.; Kilaru, A.S.; Werner, R.M.; Asch, D.A.; Hershey, J.C.; Hill, S.; Ha, Y.P.; Sellers, A.; Mahoney, K.; Merchant, R.M. Use of social media across US hospitals: Descriptive analysis of adoption and utilization. J. Med. Internet Res. 2014. [Google Scholar] [CrossRef] [PubMed]
- Nsoesie, E.O.; Kluberg, S.A.; Brownstein, J.S. Online reports of foodborne illness capture foods implicated in official foodborne outbreak reports. Prev. Med. 2014. [Google Scholar] [CrossRef] [PubMed]
- Xiang, Z.; Du, Q.; Ma, Y.; Fan, W. A comparative analysis of major online review platforms: Implications for social media analytics in hospitality and tourism. Tour. Manag. 2017. [Google Scholar] [CrossRef]
- Tang, D.; Qin, B.; Liu, T. Document Modeling with Gated Recurrent Neural Network for Sentiment Classification. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, Lisbon, Portugal, 17–21 September 2015. [Google Scholar]
- Salinca, A. Business Reviews Classification Using Sentiment Analysis. In Proceedings of the 17th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing, Timisoara, Romania, 24–27 September 2016. [Google Scholar]
- Arribas-Bel, D. Accidental, open and everywhere: Emerging data sources for the understanding of cities. Appl. Geogr. 2014, 49, 45–53. [Google Scholar] [CrossRef]
- Yelp Dataset Challenge. 2017. Available online: https://www.yelp.com/dataset_challenge (accessed on 15 February 2017).
- Danilak, M. Langdetect 1.0.7: Language Detection Library Ported from Google’s Language-Detection. Available online: https://github.com/Mimino666/langdetect (accessed on 15 January 2018).
- Sajnani, H.; Saini, V. Classifying Yelp Reviews into Relevant Categories. Available online: http://www.ics.uci.edu/ vpsaini/ (accessed on 15 January 2018).
- Bird, S.; Klein, E.; Loper, E. Natural Language Processing with Python: Analyzing Text with the Natural Language Toolkit; O’Reilly Media, Inc.: Newton, MA, USA, 2009. [Google Scholar]
- Mikolov, T.; Sutskever, I.; Chen, K.; Corrado, G.S.; Dean, J. Distributed representations of words and phrases and their compositionality. In Proceedings of the Neural Information Processing Systems Conference, Lake Tahoe, NV, USA, 5–10 December 2013; pp. 3111–3119. [Google Scholar]
- Mikolov, T.; Chen, K.; Corrado, G.; Dean, J. Efficient estimation of word representations in vector space. arXiv, 2013; arXiv:1301.3781. [Google Scholar]
- Yu, M.; Dredze, M. Improving Lexical Embeddings with Semantic Knowledge. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, Baltimore, MD, USA, 22–27 June 2014. [Google Scholar] [CrossRef]
- Von Luxburg, U. A tutorial on spectral clustering. Stat. Comput. 2007, 17, 395–416. [Google Scholar] [CrossRef] [Green Version]
- Hartigan, J.A.; Wong, M.A. Algorithm AS 136: A K-Means Clustering Algorithm. J. R. Stat. Soc. C 1979, 28, 100–108. [Google Scholar] [CrossRef]
- Belkin, M.; Niyogi, P. Towards a theoretical foundation for Laplacian-based manifold methods. J. Comput. Syst. Sci. 2008, 74, 1289–1308. [Google Scholar] [CrossRef]
- Li, Y.; Chen, C.-Y.; Wasserman, W.W. Deep feature selection: Theory and application to identify enhancers and promoters. J. Comput. Biol. 2016, 23, 322–336. [Google Scholar] [CrossRef] [PubMed]
- Zou, H.; Hastie, T. Regularization and variable selection via the elastic net. J. R. Stat. Soc. Ser. B 2005, 67, 301–320. [Google Scholar] [CrossRef]
- Powers, D.M.W. Evaluation: From Precision, Recall and F-Measure To Roc, Informedness, Markedness & Correlation. J. Mach. Learn. Technol. 2011, 2, 37–63. [Google Scholar]
- Koren, Y.; Bell, R.; Volinsky, C. Matrix Factorization Techniques for Recommender Systems. Computer 2009, 42, 42–49. [Google Scholar] [CrossRef]
- Sarwar, B.; Karypis, G.; Konstan, J.; Riedl, J. Incremental Singular Value Decomposition Algorithms for Highly Scalable Recommender Systems. In Proceedings of the Fifth International Conference on Computer and Information Science, Seoul, Korea, 28–29 November 2002; pp. 27–28. [Google Scholar]
- Harvey, D. Pattern, Process, and the Scale Problem in Geographical Research. Trans. Inst. Br. Geogr. 1968, 45, 71–78. [Google Scholar] [CrossRef]
- Kwan, M.-P.; Weber, J. Scale and accessibility: Implications for the analysis of land use-travel interaction. Appl. Geogr. 2008, 28, 110–123. [Google Scholar] [CrossRef]
- Jelinski, D.E.; Wu, J. The modifiable areal unit problem and implications for landscape ecology. Landsc. Ecol. 1996, 11, 129–140. [Google Scholar] [CrossRef]
- Kwan, M.-P. How GIS can help address the uncertain geographic context problem in social science research. Ann. GIS 2012, 18, 1–11. [Google Scholar] [CrossRef]
- Peterson, R.; Krivo, L.; Harris, M. Disadvantage and neighborhood violent crime: Do local institutions matter? J. Res. Crime Delinq. 2000, 37, 31–63. [Google Scholar] [CrossRef]
- Bellair, P.E. Informal surveillance and street crime: A complex relationship. Criminology 2000, 38, 137–170. [Google Scholar] [CrossRef]
- Rountree, P.W.; Warner, B.D. Social ties and crime: Is the relationship gendered? Criminology 1999, 37, 789–814. [Google Scholar] [CrossRef]
- Scribner, R.; Cohen, D.A.; Farley, T.A. A Geographic Relation Between Alcohol Availability. Sex. Transm. Dis. 1998, 25, 544–548. [Google Scholar] [CrossRef] [PubMed]
- Rousseeuw, P.J. Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. 1987, 20, 53–65. [Google Scholar] [CrossRef]
- U.S. Census Bureau. American Community Survey 5-Year Estimates. 2016. Available online: https://factfinder.census.gov/faces/nav/jsf/pages/index.xhtml (accessed on 15 January 2018).
- Fong, E.; Gulia, M. Differences in neighborhood qualities among racial and ethnic groups in Canada. Sociol. Inq. 1999, 69, 575–598. [Google Scholar] [CrossRef]
- Jiang, B.; Ma, D.; Yin, J.; Sandberg, M. Spatial distribution of city tweets and their densities. Geogr. Anal. 2016, 48, 337–351. [Google Scholar] [CrossRef]
- Jacobs, J. The Death and Life of Great American Cities; Alexander, C., Ishikawa, S., Silverstein, M., Eds.; Macat Library: New York, NY, USA, 1961; Volume 71. [Google Scholar]
City | MSA Population (2017) | Number of Restaurants | Number of Reviews | Reviewers’ User-ID | Number of Reviewers |
---|---|---|---|---|---|
Boston | 4,836,531 | 44,597 | 172,401 | Not available | Not available |
Charlotte | 2,525,305 | 2780 | 139,188 | Available | 39,813 |
Cleveland | 2,058,844 | 3996 | 139,824 | Available | 21,939 |
DC | 6,216,589 | 8206 | 40,420 | Not available | Not available |
Detroit | 4,313,002 | 35,823 | 81,301 | Not available | Not available |
Las Vegas | 2,204,079 | 6312 | 826,358 | Available | 275,012 |
Philadelphia | 6,096,120 | 29,045 | 91,660 | Not available | Not available |
Phoenix | 4,737,270 | 9692 | 731,744 | Available | 97,476 |
Pittsburgh | 2,333,367 | 3130 | 124,170 | Available | 33,268 |
Toronto | 6,417,516 | 11,451 | 357,940 | Available | 58,355 |
Word2Vec Output | Similarity to Classy |
---|---|
swank | 0.87688 |
trendy | 0.86152 |
chic | 0.85917 |
posh | 0.84972 |
elegant | 0.84592 |
stylish | 0.84019 |
cozy | 0.83344 |
modern | 0.80526 |
contemporary | 0.78569 |
homey | 0.77934 |
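A ranking like the one in the table above can be produced by comparing word vectors. The sketch below assumes the similarity score is cosine similarity, which is the usual choice for Word2Vec neighbors but is not stated explicitly here. The three-dimensional `toy_vectors` are invented for illustration and are not the trained embeddings.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def most_similar(word, vectors, topn=3):
    """Rank every other word by cosine similarity to `word`."""
    target = vectors[word]
    scored = [
        (other, cosine(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:topn]

# Hypothetical toy embeddings; real Word2Vec vectors have far more dimensions.
toy_vectors = {
    "classy": [0.9, 0.8, 0.1],
    "posh": [0.8, 0.9, 0.2],
    "water": [0.3, 0.1, 0.2],
    "divey": [0.1, 0.2, 0.9],
}

neighbors = most_similar("classy", toy_vectors, topn=2)
```

Seed words from the frequent-word list can be expanded this way: the nearest neighbors of each seed become candidate features, which are then reviewed before being added.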
City | Predicted Matrix | Check-in Matrix | Rating Matrix |
---|---|---|---|
Charlotte | 3 | 3 | 5 |
Cleveland | 4 | 3 | 2 |
Las Vegas | 2 | 3 | 3 |
Phoenix | 5 | 3 | 3 |
Pittsburgh | 6 | 3 | 2 |
Toronto | 6 | 6 | 2 |
City | Best Method | F1 Score | Selected Features |
---|---|---|---|
Charlotte | Check-ins | 0.64244 | salty, vegetarian, creamy, hipster, divey, dessert, calamari, asparagus, vodka |
Cleveland | Check-ins | 0.70865 | sweet, spicy, hipster, tomato, lime, meat_types, vegie_types, herb_types |
Las Vegas | Predicted Ratings | 0.71563 | braised, seared, salty, creamy, intimate, classy, modern, casual, upscale, elegant, rice, soup, wine, crab, salmon, lobster, lamb, dessert, duck, cocktail, calamari, martini, ranch, steak_types, vegie_types, herb_types, hardliq_types, softliq_types, sweets_types, Asian_types, seafood_types, pos_ambience, neg_ambience, style_types |
Phoenix | Ratings | 0.50608 | spicy, upscale, wine, pos_ambience |
Pittsburgh | Check-ins | 0.62651 | crispy, vegetarian, hipster, romantic, rice, noodle, curry, sausage, cocktail, tofu, coleslaw, wing, cheesesteak, lettuce, provolone, ranch, fast_food, dressing_types, pos_ambience, style_types |
Toronto | Ratings | 0.72686 | fried, Chinese, salty, Asian, Japanese, steamed, oily, hipster, rice, beer, soup, pork, shrimp, wine, tea, noodle, seafood, cocktail, sashimi, soy, squid, milk, sesame, Fanta, meat_types, softliq_types, Asian_types, soda_types, seafood_types, ethnic_food |
Cluster | Boston, MA | Detroit, MI | Philadelphia, PA | Washington, D.C. |
---|---|---|---|---|
Cluster 1 | 16,827 | 17,226 | 13,849 | 2780 |
Cluster 2 | 27,770 | 18,597 | 15,180 | 5419 |
City | Factor | Mean Value in Cluster 1 | Mean Value in Cluster 2 | T Statistic (Absolute Value) | p Value |
---|---|---|---|---|---|
Boston, MA | Educated population ratio | 0.06 | 0.10 | 97.46 | 0.000 |
Boston, MA | Annual household income (USD) | 66,985.93 | 68,655.57 | 5.47 | 0.000 |
Boston, MA | Black/A.A. population ratio | 0.41 | 0.40 | 13.49 | 0.000 |
Boston, MA | White population ratio | 0.53 | 0.56 | 32.97 | 0.000 |
Boston, MA | Asian population ratio | 0.02 | 0.07 | 73.04 | 0.000 |
Detroit, MI | Educated population ratio | 0.04 | 0.07 | 59.39 | 0.000 |
Detroit, MI | Annual household income (USD) | 50,359.52 | 61,600.40 | 40.69 | 0.000 |
Detroit, MI | Black/A.A. population ratio | 0.41 | 0.38 | 21.84 | 0.000 |
Detroit, MI | White population ratio | 0.55 | 0.60 | 35.75 | 0.000 |
Detroit, MI | Asian population ratio | 0.01 | 0.03 | 43.08 | 0.000 |
Philadelphia, PA | Educated population ratio | 0.05 | 0.09 | 72.51 | 0.000 |
Philadelphia, PA | Annual household income (USD) | 55,067.55 | 64,436.73 | 25.42 | 0.000 |
Philadelphia, PA | Black/A.A. population ratio | 0.39 | 0.35 | 24.14 | 0.000 |
Philadelphia, PA | White population ratio | 0.55 | 0.62 | 41.79 | 0.000 |
Philadelphia, PA | Asian population ratio | 0.03 | 0.05 | 33.30 | 0.000 |
Washington, D.C. | Educated population ratio | 0.11 | 0.15 | 25.48 | 0.000 |
Washington, D.C. | Annual household income (USD) | 53,222.42 | 80,220.32 | 28.74 | 0.000 |
Washington, D.C. | Black/A.A. population ratio | 0.36 | 0.22 | 41.42 | 0.000 |
Washington, D.C. | White population ratio | 0.55 | 0.68 | 23.94 | 0.000 |
Washington, D.C. | Asian population ratio | 0.02 | 0.04 | 15.19 | 0.000 |
All four cities combined | Educated population ratio | 0.06 | 0.09 | 134.74 | 0.000 |
All four cities combined | Annual household income (USD) | 57,322.86 | 66,673.53 | 51.06 | 0.000 |
All four cities combined | Black/A.A. population ratio | 57,322.86 | 0.37 | 29.46 | 0.000 |
All four cities combined | White population ratio | 0.40 | 0.60 | 69.42 | 0.000 |
All four cities combined | Asian population ratio | 0.54 | 0.05 | 91.96 | 0.000 |
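The absolute t statistics in the table above compare the mean of a census variable between the two clusters. The form of the test is not specified here, so the sketch below assumes Welch's unequal-variance statistic. The function name and the sample values are hypothetical.

```python
import math
import statistics

def welch_t_abs(sample_a, sample_b):
    """Absolute Welch t statistic for the difference of two sample means."""
    mean_a = statistics.mean(sample_a)
    mean_b = statistics.mean(sample_b)
    # statistics.variance is the sample variance (n - 1 denominator).
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    standard_error = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return abs(mean_a - mean_b) / standard_error

# Hypothetical per-restaurant values of one census variable in each cluster.
cluster_1 = [1.0, 2.0, 3.0, 4.0, 5.0]
cluster_2 = [2.0, 3.0, 4.0, 5.0, 6.0]
t_value = welch_t_abs(cluster_1, cluster_2)
```

With samples as large as the cluster sizes reported for these cities, even small differences in means yield large t statistics and p values that round to 0.000, so the size of the difference between cluster means is worth reading alongside the significance level.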
© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Share and Cite
Rahimi, S.; Mottahedi, S.; Liu, X. The Geography of Taste: Using Yelp to Study Urban Culture. ISPRS Int. J. Geo-Inf. 2018, 7, 376. https://doi.org/10.3390/ijgi7090376