Article

Spatial Footprints of Human Perceptual Experience in Geo-Social Media

1 Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology, Koto-ku, Tokyo 135-0064, Japan
2 Department of Electronics and Information Engineering, Korea Aerospace University, Goyang-si, Gyeonggi-do 10540, Korea
* Author to whom correspondence should be addressed.
ISPRS Int. J. Geo-Inf. 2018, 7(2), 71; https://doi.org/10.3390/ijgi7020071
Submission received: 25 December 2017 / Revised: 15 February 2018 / Accepted: 17 February 2018 / Published: 23 February 2018

Abstract

Analyses of social media have become increasingly important for understanding human behaviors, interests, and opinions. Business intelligence based on social media can reduce the costs of managing the complexities of customer trends. This paper focuses on analyzing sensation information, which represents human perceptual experiences in social media through the five senses: sight, hearing, touch, smell, and taste. First, a measurement is defined to estimate social sensation intensities; subsequently, sensation characteristics in geo-social media are identified using geo-spatial footprints. Finally, we evaluate the accuracy and F-measure of our approach by comparing it with several baselines.

1. Introduction

Social media has become a critical part of the modern information ecosystem in various business fields, such as marketing, reputation, sales, and price competition [1,2]. Many companies actively use social media in their marketing activities by uploading multimedia advertising spots on YouTube (https://www.youtube.com/), running real-time promotions on Facebook (https://www.facebook.com/) and Twitter (https://twitter.com/), or posting detailed information about their products on Wikipedia (https://www.wikipedia.org/), although social media marketing has drawbacks such as backlash, user abuse, and malicious rumors [3,4]. Accordingly, business intelligence based on social media big data analysis has been emerging in both the academic and business communities over the past decade [5,6,7,8]. Social media analysis helps companies understand what customers are thinking, how the market is changing, and even what their competitors are doing [9,10,11,12].
Comprehending individual preferences and public popularity leads to successful marketing and product strategies. For this reason, opinion mining and sentiment analysis are among the most active topics in social media big data research, as shown in [13,14,15,16]. Opinion is a primary influencer of human activities and decision-making. Subjectivity classifications indicating whether opinions are positive, negative, or neutral help companies identify how consumers perceive their products, services, and brands, as well as indirectly obtain feedback from consumers. However, human sensory perception, here called sensation information, has been found to be a primary factor affecting human behaviors that precedes opinions, sentiments, emotions, and evaluations [17,18,19,20]. Sensation information has been applied in marketing to design new products and imprint company brands [21,22,23,24,25,26]. For example, human interaction devices such as the Oculus Rift (https://www.oculus.com/rift/) and Gloveone (https://www.neurodigital.es/gloveone/) fundamentally consider human sensory experiences in their design and functionality. In [27], a language processor bridging the gap between perception and action was created to enable intelligent human–robot interactions during sensorimotor tasks. Although sensory experiences directly affect decision-making, actions, and sentiment formation in humans, few studies have considered sensation information itself in text mining.
This paper focused on sensation analysis by extracting, quantifying, and identifying sensation information from sight (ophthalmoception), hearing (audioception), touch (tactioception), smell (olfacception), and taste (gustaoception) as introduced in [28]. Human perception is the process of selecting, organizing, and interpreting sensory information into meanings or concepts on an individualized basis. For example, the statement, “I went to a restaurant with my friend a few days ago. It was so crowded, and music was played at a high volume. Although the food was great, I could not clearly hear my friend’s voice. I do not want to go there again”, presents a negative opinion regarding the restaurant. When we look at the reasoning behind this negative evaluation, auditory perception is an important criterion. Human behavior is influenced by unconscious states as well as conscious thinking, as shown in Figure 1.
However, finding sensory perceptions in natural language remains a rudimentary task, with only a small set of lexicons and training corpora available. Furthermore, word sense disambiguation (WSD) and metaphorical expressions create difficulties in distinguishing sensations from opinions or sentiments, as in "He gave me a cool reception" and "All people are blind to their own mistakes". Here, quantitative features of sensation were extracted from social media with geographic locations, based on a previous study [28], and geo-spatial pattern differences in social perceptual experiences were investigated. The main contributions of this paper are threefold:
  • Sensation feature analysis: Sensation measurements assigned to sight, hearing, touch, smell, and taste in sentences were computed using sensation word sets and WordNet [29]. Based on the five senses, unstructured text was turned into structured data for sensation classification. This classification contributed to the construction of a sensation corpus for studying social perceptual experiences by lowering cost and complexity.
  • Geo-spatial footprint analysis: The natural language of social media was treated as a kind of sensor data, which we exploited to discover geo-spatial knowledge. For this purpose, a domain area was divided into several sub-areas to identify sensation patterns from geo-spatial footprints on social media. Additionally, strong trends in sensation features were identified for each area and their patterns were compared.
  • Comprehensive evaluation: To evaluate the proposed classification approach, available alternatives (random forest, support vector machine (SVM), multilayer perceptron (MLP), convolutional neural network (CNN), and recurrent convolutional neural network (RCNN)) were run with both the proposed sensation features and general word-based features. These results identified the best-performing combinations for classification.
The remainder of this paper is organized as follows. Section 2 briefly reviews related work encompassing social media analysis methods and several related techniques. In Section 3, a measurement for sensation classification is presented, followed by the sensation features of geo-social media data and the resulting evaluations of sensation classification against several baselines, including neural network methods, in Section 4. Finally, Section 5 concludes the paper and outlines future work.

2. Related Work

Social networks play a crucial role in understanding daily lives as information channels outside of traditional communication tools. Social media and mobile technologies have begun delivering human experiences such as observations, emotions, judgments, and even behaviors. The popularity of social media has boosted its utilization in advancing business information technologies to increase sales by analyzing unstructured content. A comprehensive survey of social media analysis is given in [30]. In particular, sentiment analysis (also known as opinion mining or emotion AI) of unstructured textual data was adopted early in business intelligence for garnering feedback on human experiences with products and services. Sentiment analysis is intended to classify polarities from text, such as positive, negative, or neutral, as addressed in [31]. Early works in sentiment analysis by Turney [32] and Pang et al. [33] attempted to detect polarities in product and movie reviews using machine learning techniques. More recently, fine-grained analyses of emotion (happy, sad, etc.), mood (cheerful, depressed, etc.), interpersonal stance (friendly, distant, etc.), attitude (like, love, etc.), and personality traits (nervous, anxious, etc.) have been conducted as machine learning techniques have developed. These have been widely applied to real-life services such as personal recommendation [34,35,36], stock prediction [37,38,39], and political campaigns [40,41,42]. De Vries et al. [43] investigated the relation between brand popularity and numbers of fans as an indication of brand recognition on social media. In [44], Hollebeek et al. attempted to identify consumer-brand engagement (CBE) to reflect the nature of interactive brand relationships (including social interactions, emotional and cognitive cues, and brand-related behaviors) among particular consumers in social media. Furthermore, Jussila et al. [45] and Baird and Parasnis [46] addressed the opportunities and challenges of utilizing social media in business-to-business relationships and customer relationship management, respectively.
There are two main approaches to sentiment analysis: learning-based and lexicon-based. In learning-based approaches, sentiment classification can be formulated as a supervised learning problem with three classes: positive, negative, and neutral [31]. Almost any supervised learning method can be used for sentiment classification, including naive Bayes and SVM. Pang et al. [33] took this approach to classify movie reviews into two categories: positive and negative. Their work demonstrated that using unigrams as features in classification performed well with either naive Bayes or SVM. Tan et al. [47] used opinion words to label a subset of informative examples and then trained a new supervised classifier on these labeled examples.
Lately, deep learning has become a highly important method for analyzing text and image data. State-of-the-art deep learning methods such as the CNN, RNN, and recurrent convolutional neural network (RCNN) are continually being developed for sentiment analysis. RNN techniques were previously the most popular for sentiment analysis and were dramatically improved by long short-term memory (LSTM), developed by Hochreiter and Schmidhuber [48]. Tang et al. [49] introduced a neural network model (LSTM-GRNN) for document-level sentiment classification. Their model performed end-to-end learning more effectively than SVM and CNN on four review datasets. In past research, CNNs were primarily applied to image processing; however, Kim [50] proposed a CNN for sentiment text classification with remarkable experimental results. Poria et al. [51] employed a deep CNN to extract features from text for utterance-level sentiment analysis. Furthermore, Lai et al. [52] combined recurrent and convolutional sentiment analysis methods for short texts. They took advantage of the local features generated by CNNs and the long-distance dependencies learned via RNNs to overcome constraints caused by short text.
Word sense disambiguation (WSD), the task of computationally identifying the meaning of words in context, is worthy of notice in natural language processing [53]. Several approaches exist for solving WSD problems, covering polysemic, metonymic, and metaphorical extensions [54,55,56]. Naturally, sentences containing sensation words carry such ambiguities of meaning. Consider two example sentences: (1) "Blue cheese is quite pungent but it's delicious"; (2) "She is the author of a pungent political comedy". The word pungent is commonly used to represent sensory information, "a sharply strong taste or smell". In the second sentence, however, it is used metaphorically to represent "a sharp and caustic quality of comment, criticism, or humor". This demonstrates that words can be interpreted in multiple ways depending on their linguistic context. To address WSD and identify the sense of a polysemic or metaphorical word, there are knowledge-based approaches, machine-learning-based approaches, and hybrid approaches. Knowledge-based approaches rely on knowledge resources or dictionaries such as WordNet and thesauri and mainly use hand-coded grammar rules for disambiguation. Machine-learning-based approaches, by contrast, rely on corpus evidence and train a probabilistic or statistical model using tagged or untagged corpora. Hybrid approaches mix the two, using corpus evidence as well as semantic relations from knowledge resources.
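As a small illustration of a knowledge-based approach, the sketch below applies the classic Lesk gloss-overlap heuristic, via NLTK and WordNet, to the two pungent examples above; it is an illustrative stand-in only, not the disambiguation pipeline used in this paper.

```python
# A minimal knowledge-based WSD sketch using NLTK's Lesk implementation over
# WordNet glosses. Illustrative stand-in; not the exact pipeline of this paper.
import nltk
from nltk.wsd import lesk

nltk.download("wordnet", quiet=True)

sentences = [
    "Blue cheese is quite pungent but it's delicious",
    "She is the author of a pungent political comedy",
]

for sentence in sentences:
    context = sentence.lower().split()
    # Lesk chooses the WordNet sense whose gloss overlaps most with the context.
    sense = lesk(context, "pungent")
    print(sentence)
    print("  chosen sense:", sense.name() if sense else "none",
          "-", sense.definition() if sense else "")
```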
Furthermore, sentiment lexicons are crucial resources for analyzing polarities in sentiment analysis. Lexicon-based approaches rely on the sum of the sentiment orientations of individual words or phrases. Palanisamy et al. [57] described a method of building lexicons from the Serendio taxonomy, and then classified tweets as positive or negative based on the contextual sentiment orientation of words from their lexicon. Melville et al. [58] suggested a framework incorporating lexical knowledge into supervised learning to improve accuracy; additionally, Taboada et al. [59] used known opinion words for sentiment classification. Several open lexicons such as SentiWordNet [60], Q-WordNet [61], and WordNet-Affect [62] have already been developed to support sentiment analysis. Such lexicon-based approaches are highly effective, especially when no supervised datasets are available.
This study proposed a lexicon-based approach for the sensation classification of Twitter posts, because few labeled examples exist. Sensation information is a property that depends on the perception of human sensory organs and the processing of their signals in the brain. According to the psychology studies in [63], decision-making is based on sensation information from the unconscious mind. Hence, sensation information affects human decision-making prior to cognitive thinking. As the use of sensation information emerges in marketing and business, some companies strive to improve their brand images using a method called sensory branding. For example, pairing sound with food and drink has been scientifically shown to enhance the human experience of flavor. The study in [64] found that high-frequency sounds help to stimulate sweet taste sensations, whereas low sounds bring out bitterness. British Airways (https://www.britishairways.com) is banking on this sensory science to stand out in the premium market. Furthermore, Dunkin' Donuts (https://www.dunkindonuts.com) stimulates the sense of smell to entice visitors by releasing a coffee aroma into buses whenever the company jingle plays. Even though sensation information is clearly an important feature reflecting the five senses in human language, very few works have been published on estimating or discovering sensation information from text. If the trends of human perceptual experiences can be understood using social media, these trends can be rapidly adopted to develop new human-centered designs and systems.

3. Methodology

This section explains how sensation sentences representing social perceptual experiences are classified. First, a dictionary of representative seed words was built to include all five senses: sight (ophthalmoception), hearing (audioception), touch (tactioception), smell (olfacception), and taste (gustaoception). Next, the intensity of each sensation feature was calculated based on this dictionary and WordNet, and then, finally, sensation sentences were identified. Figure 2 shows the classification procedures of Geo-partitioning, Tagging, Enriching, and Scoring.

3.1. Lexicon Resources

One important lexicon resource is a set of seed words containing sensory activities such as sight, hearing, touch, smell, and taste. For this paper, the sensation word set was built from words manually collected using Google search results from the queries “sensation word” and “sensory word”. The resulting word set included 1222 words such as lucent and murky (for sight), sizzle and whir (for hearing), chill and tingle (for touch), reek and stench (for smell), and sour and ambrosial (for taste).
WordNet, a computational lexicon of English based on psycholinguistic principles, was used as a resource for dealing with WSD. It encodes concepts in terms of sets of synonyms (called synsets), and the latest version contains approximately 155,000 words organized in over 117,000 synsets. In order to address WSD and identify the sense of a polysemic word, each word in this paper was enriched with WordNet synsets.
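A minimal sketch of how such lexicon resources might be loaded and queried is shown below; the seed words are a tiny hypothetical subset of the 1222-word set described above, and the WordNet access uses the standard NLTK interface.

```python
# Illustrative sketch of the lexicon resources: a hypothetical subset of the
# sensation seed word set and WordNet synset lookups via NLTK.
import nltk
from nltk.corpus import wordnet as wn

nltk.download("wordnet", quiet=True)

SENSATION_SEEDS = {
    "sight":   {"lucent", "murky"},
    "hearing": {"sizzle", "whir"},
    "touch":   {"chill", "tingle"},
    "smell":   {"reek", "stench"},
    "taste":   {"sour", "ambrosial"},
}

for feature, words in SENSATION_SEEDS.items():
    for word in sorted(words):
        synsets = wn.synsets(word)
        print(f"{feature:7s} {word:10s} -> {[s.name() for s in synsets[:3]]}")
```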

3.2. Sensation Feature Extraction

3.2.1. Geo-Partitioning

The primary task of this work was to partition social media data into several small geographical areas. One aim was to identify sensation intensities in terms of geo-spatial location, and thus the data had to be grouped by spatial coordinates. For this work, the whole world was partitioned equally into uniquely numbered fixed-size cells, $C_{id}$.
With the development of mobile applications, social media users can generate social data with attached GPS coordinates, including latitude and longitude. Among social media data, we focused on tweets from Twitter, one of the most popular social media services. Thus, the Twitter data could be partitioned into cells using the naming protocol $G_{id} = \{D_T, C_{id}\}$, where $G_{id}$ was a partitioned Twitter datum associated with $D_T$ (the tweet text) and $C_{id}$ (the corresponding spatial cell); furthermore, $D_T = \{T_1, T_2, \ldots, T_n\}$ represented a set of $n$ geo-partitioned tweet texts.
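The cell construction can be sketched as follows, assuming a simple equal-degree grid; the 1-degree cell size and the record layout are illustrative assumptions rather than the exact partitioning used in the paper.

```python
# Sketch of geo-partitioning: map a (latitude, longitude) pair to a uniquely
# numbered fixed-size cell id C_id, then attach it to the tweet text D_T.
# The 1-degree cell size is an assumption chosen for illustration.
CELL_SIZE_DEG = 1.0
COLS = int(360 / CELL_SIZE_DEG)   # number of cells per latitude row

def cell_id(lat: float, lon: float) -> int:
    row = int((lat + 90.0) / CELL_SIZE_DEG)
    col = int((lon + 180.0) / CELL_SIZE_DEG)
    return row * COLS + col

def partition(tweets):
    """tweets: iterable of (text, lat, lon); returns G_id-style records."""
    return [{"D_T": text, "C_id": cell_id(lat, lon)} for text, lat, lon in tweets]

# Usage with two hypothetical geo-tagged tweets.
sample = [("It's such a cold morning", 40.71, -74.00),
          ("Music really loud from the next door", 51.50, -0.12)]
print(partition(sample))
```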

3.2.2. Tagging

In order to retrieve a corresponding part-of-speech (POS) word synset, tweet texts were preprocessed via morpheme analysis using the POS tagger for Twitter [65]. This study accounted only for Noun, Verb, Adjective, and Adverb classifications among the eight parts of speech in English. These are the four parts of speech primarily used in sentences associated with sensation features: sight, hearing, touch, smell, and taste. For example, the quotation “I smell a subtle fragrance of perfume. It is never smelled so sweet!” passes only the following POS results on to the next step: {(smell, Verb), (subtle, Adj), (fragrance, Noun), (perfume, Noun), (never, Adv), (smell, Verb), (sweet, Adj)} (hereafter referred to as tagging-pairs).
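A hedged sketch of this filtering step is given below; it substitutes NLTK's general-purpose tagger for the Twitter-specific tagger of [65], so the exact tags may differ from those used in the paper.

```python
# Sketch of the tagging step: keep only Noun/Verb/Adjective/Adverb tagging-pairs.
# The paper uses a Twitter-specific POS tagger [65]; NLTK's general-purpose
# tagger serves here only as an illustrative stand-in.
import nltk

# The tagger resource name differs across NLTK versions; downloading both is harmless.
nltk.download("averaged_perceptron_tagger", quiet=True)
nltk.download("averaged_perceptron_tagger_eng", quiet=True)

KEEP = {"N": "Noun", "V": "Verb", "J": "Adj", "R": "Adv"}  # Penn Treebank tag prefixes

def tagging_pairs(text: str):
    tokens = [t.strip(".,!?") for t in text.split()]
    tagged = nltk.pos_tag([t for t in tokens if t])
    # Keep only the four parts of speech used for sensation features.
    return [(word.lower(), KEEP[tag[0]]) for word, tag in tagged if tag[0] in KEEP]

print(tagging_pairs("I smell a subtle fragrance of perfume. It is never smelled so sweet!"))
```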

3.2.3. Enriching

The third step of sensation feature extraction was enriching the POS results by attaching WordNet synsets to reduce the effect of WSD and reveal the appropriate meaning of words. In this process, the most useful semantic relations were selected from WordNet in accordance with the POS of each word. Table 1 shows the semantic relationships selected from WordNet for each POS.
Let $T_e = \{t_1, t_2, \ldots, t_k\}$ be a set of tagging-pairs with $k$ elements and $t_i = (w_i, pos_i)$, $1 \le i \le k$, be the $i$th word ($w$) and POS tag ($pos$) pairing. The function $H_t$ was defined to return a set of words containing similar meanings drawn from the WordNet synsets of $w$ via the semantic relationships selected for $pos$, including the $w$ from tagging-pair $t$. For example, the function enriches the word smell in tagging-pair $t$ = (smell, Verb) into the set of words $H_t$ = {smell, odorize, odourise, scent, radiate, project, ache, smack, reek, perceive} by traversing the selected semantic relations {direct hypernym, sister term} corresponding to the POS tag Verb.
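A possible realization of the enriching function $H_t$ with NLTK's WordNet interface is sketched below; the mapping of the Table 1 relations onto NLTK calls (direct hypernyms and their hyponyms as sister terms for nouns and verbs, similar and see-also terms for adjectives, synset lemmas as synonyms) is our reading of the table, not the authors' exact code.

```python
# Sketch of the enriching function H_t: expand a tagging-pair (w, pos) into a set
# of related words using the semantic relations of Table 1. The mapping of those
# relations onto NLTK WordNet methods is an interpretation for illustration.
import nltk
from nltk.corpus import wordnet as wn

nltk.download("wordnet", quiet=True)

POS_MAP = {"Noun": wn.NOUN, "Verb": wn.VERB, "Adj": wn.ADJ, "Adv": wn.ADV}

def enrich(word: str, pos: str) -> set:
    related = {word}
    for syn in wn.synsets(word, pos=POS_MAP[pos]):
        related.update(l.name() for l in syn.lemmas())           # synonyms
        if pos in ("Noun", "Verb"):
            for hyper in syn.hypernyms():                        # direct hypernyms
                related.update(l.name() for l in hyper.lemmas())
                for sister in hyper.hyponyms():                  # sister terms
                    related.update(l.name() for l in sister.lemmas())
        elif pos == "Adj":
            for rel in syn.similar_tos() + syn.also_sees():      # similar / see also
                related.update(l.name() for l in rel.lemmas())
    return {w.replace("_", " ") for w in related}

print(sorted(enrich("smell", "Verb"))[:10])
```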

3.2.4. Scoring

The scoring task employed the Lesk Algorithm [66] to calculate the gloss overlap between the sense definitions of two or more target words. Given the context of two words, (w1, w2), the senses of target words whose definitions had the highest overlap (specifically, words in common) were assumed to be the correct ones. As seen in much of the latest research based on this method, sensation measures were defined by optimizing the use of WordNet and the Lesk algorithm.
A single sensation word set was referred to as $Sensation_f$, where $f$ was a feature of sight, hearing, touch, smell, or taste. Given a sensation word set $Sensation_f$ of a feature $f$, the sensation intensity of the feature $f$ for a tagging-pair $t$, with the results of the enriching task $H_t$, was calculated as follows:
  • Definition 1. Sensation Intensity
    $$I_{Sensation_f}(t) = \sum_{e \in H_t} S_e, \qquad (1)$$
where $e$ was an element (enriched word) from function $H_t$ and $S_e$ was the indicator function defined below:
$$S_e = \begin{cases} 1 \cdot \sigma, & \text{if } e.w \in Sensation_f \\ 0, & \text{otherwise,} \end{cases} \qquad (2)$$
where $\sigma$ was a weight factor for normalizing the occurrences of words in the sensation word sets and text. First, the size of each sensation word set, $|Sensation_f|$, was accounted for, because a sensation word set containing more words may yield a higher sensation intensity; specifically, the probability of an enriched word being included in the word set $Sensation_f$ increases with its size. To normalize this imbalance when computing the sensation intensities of each feature, the weight factor in the indicator function was adjusted using the ratio between the word counts of the sensation word sets, as in the following equation:
$$\sigma = \sigma_S = \overline{|Sensation_f|} \, / \, |Sensation_f|, \qquad (3)$$
where $|Sensation_f|$ was the number of words in the sensation word set of feature $f$ and $\overline{|Sensation_f|}$ was the average word count over all sensation word sets. Thus, the weight was the reciprocal of this ratio, giving higher weights to smaller sensation word sets.
Second, the importance of a word was incorporated as a weight factor using tf-idf [67]. The tf-idf measure is a popular term-weighting scheme for assessing imbalances in word occurrences across a document corpus. For a corpus of tagging-pair sets $D_T = \{T_1, T_2, \ldots, T_n\}$, the scoring in this paper used a weight factor defined by the following equation:
$$\sigma = \sigma_S \times (tf_{w,T_i} \times idf_w), \qquad (4)$$
where $tf_{w,T_i}$ was the number of occurrences of a word $w$ in a tagging-pair set $T_i \in D_T$, representing a text, and $idf_w$ was the number of tagging-pair sets in $D_T$ containing the word.
Given a sensation word set $Sensation_f$ for $f \in F = \{sight, hearing, touch, smell, taste\}$, the sensation intensity for a set of tagging-pairs $T_i = \{t_{i1}, t_{i2}, \ldots, t_{ik}\}$ was calculated based on Definition 1 and the indicator function in Equation (2) as follows:
$$I_{Sensation_f}(T_i) = \sum_{t \in T_i} \sum_{e \in H_t} S_e. \qquad (5)$$
Finally, texts were represented by vectors in a five-dimensional space, one dimension per sensation feature; for instance, $I(\{(honey, Adj), (smell, Verb)\}) = (I_{sight}, I_{hearing}, I_{touch}, I_{smell}, I_{taste}) = (4, 3, 10, 3, 8)$. Namely, the sensation intensity could indicate how the five sensation features were reflected in a sentence or corpus. An experimental result demonstrating the differences in sensation feature intensities with respect to a corpus topic is shown in Section 4.
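Putting these definitions together, a minimal sketch of the scoring task (Definition 1 with the $\sigma_S$ weight of Equation (3); the tf-idf variant of Equation (4) would simply multiply in a per-word tf-idf factor) might look as follows. The seed sets are hypothetical, and the enriching function from the previous sketch would replace the trivial stand-in used here so the snippet runs on its own.

```python
# Sketch of the scoring task: Definition 1 with the sigma_S weight of Equation (3).
# enrich_fn would be the WordNet enriching function H_t from the previous sketch;
# a trivial identity stand-in keeps this snippet self-contained.
from statistics import mean

SENSATION_SEEDS = {            # tiny hypothetical subset of the real word sets
    "sight": {"lucent", "murky"}, "hearing": {"sizzle", "whir"},
    "touch": {"chill", "tingle"}, "smell": {"reek", "stench", "smell"},
    "taste": {"sour", "ambrosial", "honey"},
}

def sensation_intensity(tagging_pairs, seeds, enrich_fn):
    avg_size = mean(len(words) for words in seeds.values())
    intensity = {feature: 0.0 for feature in seeds}
    for word, pos in tagging_pairs:
        enriched = enrich_fn(word, pos)                  # H_t for this tagging-pair
        for feature, words in seeds.items():
            sigma = avg_size / len(words)                # Equation (3): sigma_S
            hits = sum(1 for e in enriched if e in words)
            intensity[feature] += sigma * hits           # Definition 1: sum over e in H_t
    return intensity

pairs = [("honey", "Adj"), ("smell", "Verb")]
print(sensation_intensity(pairs, SENSATION_SEEDS, lambda w, p: {w}))
```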

3.3. Sensation Classification

This study simply classified texts into two groups: True if they represented sensation information and False otherwise. This binary classification was chosen because there were insufficient data to train for individual sensation features; classifying each sensation feature individually remains a task for future work once a sufficient amount of corpus evidence has been collected. In order to identify the existence of sensation representation in a text with the described sensation intensity features, an SVM variant was employed. The SVM classifier has shown substantial performance gains in many binary classification problem domains [68]. First, a labeled training set of $m$ examples was manually prepared: $TS = \{(x_1, y_1), (x_2, y_2), \ldots, (x_m, y_m)\}$, where the label $y_i = \pm 1$ indicated one of the two classes (True: +1, False: −1) to which the $i$th input feature vector $x_i$ belonged. It was assumed that the training set was linearly separable by a hyperplane (decision boundary) of the form $w^T x + b = 0$, where $w$ and $b$ represented the weight vector and bias of the plane, respectively. The SVM classifier then found the hyperplane maximizing the margin $2/\lVert w \rVert$ between the two groups (sensation and non-sensation).
For the input features $x$, an additional dimension (the accumulation of intensities) was included alongside the five features (sight, hearing, touch, smell, and taste) because of the binary classification setting. Each $x$ was therefore a six-dimensional vector found by computing the intensity of each sensation feature, $x = (I_{sight}, I_{hearing}, I_{touch}, I_{smell}, I_{taste}, I_{sum})$, where $I_{sum} = \sum_{f \in \{sight, hearing, touch, smell, taste\}} I_f$. However, two issues arose when using the original sensation intensity values for classification. First, the number of synsets in WordNet has a side effect: words with more synsets may have inflated sensation intensities due to the larger enriched word set calculated by Equation (1). For example, the word clear can be used as a Noun, Verb, Adjective, or Adverb and has multiple meanings for the same POS (i.e., 24 meanings under Verb in WordNet). In order to reduce this effect, the sensation intensity was redefined from Equation (1) as follows:
  • Definition 2. Normalized Sensation Intensity
    $$I_{Sensation_f}(t) = \sum_{e \in H_t} S_e \, / \, |H_t|, \qquad (6)$$
where $|H_t|$ was the number of words enriched by function $H_t$ with the relevant semantic relationships in the enriching task.
Second, the length of each text was considered, because the sensation intensity is also influenced by the number of words in the measurement per Equation (5). The more words a text includes, the larger the derived sensation feature intensity score becomes. Thus, the equation was modified by dividing the sensation feature intensity score by the number of elements in the tagging-pair set $T$ as follows:
$$I_{Sensation_f}(T) = \sum_{t \in T} I_{Sensation_f}(t) \, / \, |T|, \qquad (7)$$
where $|T|$ was the number of elements in the tagging-pair set. Accordingly, the training and test data were transformed into six-dimensional vectors using the normalized sensation intensities of the features for this classification problem.
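With the normalization of Definition 2 and Equation (7), the classifier itself reduces to a linear SVM over six-dimensional vectors. The sketch below uses scikit-learn as a stand-in for the LIBSVM setup reported in Section 4, with placeholder training data in place of the labeled tweets.

```python
# Sketch of the sensation/non-sensation classifier: a linear SVM over
# six-dimensional vectors built from the normalized intensities
# (five senses plus their sum). scikit-learn stands in for LIBSVM here.
import numpy as np
from sklearn.svm import SVC

FEATURES = ("sight", "hearing", "touch", "smell", "taste")

def to_vector(intensities):
    # intensities: dict feature -> normalized intensity per Definition 2 / Eq. (7)
    values = [intensities[f] for f in FEATURES]
    return values + [sum(values)]               # I_sum as the sixth dimension

# Placeholder training data; the real x_i come from the labeled tweet set.
rng = np.random.default_rng(0)
X = rng.random((100, 6))
y = np.where(rng.random(100) > 0.5, 1, -1)      # True: +1, False: -1

clf = SVC(kernel="linear").fit(X, y)            # finds w, b maximizing margin 2/||w||
print(clf.predict([to_vector({"sight": 0.1, "hearing": 0.0, "touch": 0.4,
                              "smell": 0.05, "taste": 0.0})]))
```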

4. Experiments and Evaluations

The following experiments were performed with two assessment purposes. One was to evaluate the accuracy of the sensation features derived from the Twitter corpus (http://noisy-text.github.io/2016/index.htm) with several machine learning techniques and baselines. The other was to reveal the regional characteristics of social perceptual experiences. In concrete terms, this study aimed to observe how social sensation, across the five senses, reacts to trends in the same time period among geo-spatial groups. In particular, we tried to discover usage patterns of sensation expression between geo-spatial groups. The tweet text was treated as a kind of human sensor data for revealing geo-spatial knowledge in terms of sensation information.

4.1. Dataset

This experiment used the Twitter corpus among other social media sources. The dataset consisted of approximately seven million tweets including tweet text, tags, URLs, time, and geo-location information. Two hundred thousand tweets were randomly selected from all tweets posted between 2007 and 2015 that contained tweet text, time, and geo-location information. As there is no existing corpus for sensation classification, a judgment was made as to whether these tweets contained sensation information or not. Only tweets for which at least two of the three annotators made the same decision were taken into account. As a result, approximately fifteen thousand tweets were selected from the overall dataset for the training and test datasets.

4.2. Classification Results

First, the performance of the proposed classification based on sensation features was determined. To simplify comparisons with the selected baselines, the Twitter corpus was rearranged into two types of input features: sensation features and general word-based features. Concretely, we examined whether the proposed sensation features can yield good performance with simple classification algorithms (i.e., naive Bayes, decision tree, and SVM) compared with several state-of-the-art techniques (i.e., neural network algorithms) when there is not enough training data. For the experiments with word-based feature inputs, baselines such as random forest, SVM, MLP, CNN, and RCNN were employed. To evaluate our sensation features, the naive Bayes [69], decision tree (C4.5) [70], random forest [71], and SVM algorithms were utilized. The naive Bayes, C4.5, and random forest algorithms were evaluated using the Weka [72] toolkit, whereas the TensorFlow (https://www.tensorflow.org/) library was used to implement the MLP [73], CNN, and RCNN algorithms. The CNN and RCNN algorithms were adopted from Kim [50] and Lai et al. [52], respectively. Additionally, LIBSVM [74] was applied as the SVM classifier for sensation binary classification; in this classification, higher sensation intensities were classified into the positive (sensation) class. To verify the classification results, all experiments were evaluated using 10-fold cross validation, as there was a lack of supervised sensation data.
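The evaluation protocol can be sketched as follows, with scikit-learn models standing in for the Weka, TensorFlow, and LIBSVM implementations actually used; the feature matrix is a random placeholder, so the printed scores are meaningless and serve only to illustrate the 10-fold procedure.

```python
# Sketch of the evaluation protocol: 10-fold cross validation reporting accuracy
# and F1 for several baselines. scikit-learn models are stand-ins for the Weka /
# TensorFlow / LIBSVM implementations used in the paper; X, y are placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.model_selection import cross_validate

rng = np.random.default_rng(1)
X = rng.random((300, 6))                    # sensation feature vectors (placeholder)
y = rng.integers(0, 2, size=300)

baselines = {
    "Naive Bayes": GaussianNB(),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "SVM": SVC(kernel="linear"),
}
for name, model in baselines.items():
    res = cross_validate(model, X, y, cv=10, scoring=("accuracy", "f1"))
    print(f"{name:14s} acc={res['test_accuracy'].mean():.3f} "
          f"F1={res['test_f1'].mean():.3f}")
```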
Table 2 summarizes the classification results in terms of accuracy and the F1 measure, which considers both precision and recall. In this table, the results obtained using general word features are in the upper rows, whereas the lower rows show the sensation feature results using the TF-IDF × sensation weight. Among the word-feature results, the neural network classifiers (MLP, CNN, and RCNN) achieved better performance than the traditional methods in both F1 and accuracy. However, the most encouraging result is the TF-IDF × sensation weight classifier given by Equation (4), which achieved 80.24% accuracy and an 80.1% F1 measure in the averaged evaluation. Although the word-feature accuracy was slightly better when the RCNN classifier was used (80.1% F1 measure and 80.39% accuracy) than with the proposed sensation features, the sensation feature classifiers generally demonstrated good performance even with traditional methods such as C4.5 and random forest. These experiments were performed with insufficient data to fully assess classification performance; however, they showed that sensation features merit further consideration as input features for sensation classification.

4.3. Geo-Spatial Analysis of Sensation Intensity

The geo-spatial sensation intensity distribution was investigated across the globe using seven million tweets, with approximately 80% classification accuracy. Before the geo-spatial analysis, it was verified that the proposed measure appropriately reflected real-world sensation intensities. Figure 3 includes two pie charts showing tweet sensation intensities and the importance ratio of the senses for human behaviors from survey results [75]. From Figure 3b, sight has the greatest effect on human behavior (74%), followed by hearing (8%), touch (7%), taste (6%), and smell (5%). The sensation intensity ratio found using the proposed measurement, shown in Figure 3a, is highly similar to Figure 3b. Thus, the proposed measurement can be considered a reasonable sensation intensity measurement within sensation-classified tweets, as in the following examples:
  • Sight: My eyes are on black haired girls with blue and green eyes.
  • Hearing: Music really loud from the next door.
  • Touch: It’s such a cold morning.
  • Smell: The room smelt odour like rotten eggs and spoiled tomatoes.
  • Taste: This root beer is definitely sweet but doesn’t contain alcohol.
Additionally, it was verified that the weight functions described above could appropriately normalize biased sensation intensities. Sensation intensity normalization was essential for identifying the importance of each intensity. In Section 3, two weight functions were proposed to normalize the occurrences of words in the sensation word sets and tweet texts. As shown in Figure 3a, the unweighted sensation intensity ratio is extremely biased toward sight sensations. However, the sensation intensities in Figure 4a show that the overvalued sensation intensities were reduced relative to the unweighted case of Figure 3a. In addition, superposing the sensation weights with the TF-IDF weights produced sensation intensities that were approximately proportional to each other, as shown in Figure 4b. As a result, it was concluded that the weight functions were able to properly calibrate the sensation intensity biases. Thus, the geo-spatial analysis experiments used sensations with the TF-IDF weight function applied, to clearly distinguish the ratio of sensation intensities.
In our experiments, we explored the usage patterns of sensation expressions in two respects. First, the geo-spatial sensation intensity distribution was estimated for English-speaking and non-English-speaking regions to identify differences in the proportions of sensation expression usage. Second, we examined how sensation expressions react to natural phenomena, such as the temperature during the summer season in the United States.
For the first purpose, an experiment was conducted to reveal the sensation intensities of each geo-spatial group. The top seven groups in terms of tweet volume were considered English-speaking regions, such as the United States, Australia, and Western Europe (including Britain). From Figure 5, there was no significant difference in sensation intensity between the English-speaking geo-spatial groups. However, one remarkable aspect of this experiment was that the sense of touch was more verbalized than hearing in social media, in contrast to the general statistics. Next, the scope of the sensation intensity analysis was extended to the rest of the world, including non-English-speaking regions. In the geo-partitioning task, the globe was separated into uniquely numbered cells with their corresponding tweets. As shown in Figure 6a, some geo-spatial groups manifested very different sensation intensities, especially groups 6, 7, and 8. These groups were mostly located in non-English-speaking countries such as Russia and Eastern European countries. They showed a distinctive pattern containing a large proportion of smell and taste sensation features relative to the English-speaking groups. Furthermore, Latin America and Africa (including groups 21, 22, 23, and 24) showed a greater sensitivity to touch relative to other senses, contrasting with the English-speaking countries. As a result, it can be hypothesized that there are distinctive differences in the usage of sensation expressions between English-speaking and non-English-speaking countries, at least in social media. However, the experimental data, i.e., the tweets, were seriously unequal across regions, as shown in Figure 6b. This imbalance is an important factor affecting the validity and reliability of the experiment. To address these concerns about data volume, another experiment was conducted with English tweets in the United States.
Next, the experiment focused on a single English-speaking country (the United States) to reveal how sensation expressions react to natural phenomena in different geo-spatial groups. The hypothesis of this experiment was that the hotter a place is, the higher the touch sensation intensity observed there. This assumption seems reasonable, since touch sensations are usually verbalized with words such as hot, cool, warm, and humid when a temperature deviation is felt. To verify this hypothesis, tweets were separated into geo-spatial groups (i.e., the United States by state), and sensation intensities were then estimated for comparison with the average temperature of every state. We collected the average temperatures of the summer season (June to August) from 2010 to 2014 from the National Centers for Environmental Information (NCEI) of the National Oceanic and Atmospheric Administration (NOAA) (https://www.ncdc.noaa.gov/cag/). Moreover, the usage statistics of Twitter were examined in terms of tweet volume and the ratio of sensation expressions, to determine whether the tweet volume contributes to the sensation expression.
From Figure 7, the hypothesis proved to be reasonable, since the usage rate of the touch sensation intensity closely tracked the average temperatures in most states. For example, California, Texas, and Florida showed higher temperatures than the northern states (e.g., Montana, North Dakota, or Minnesota), as shown in Figure 7a. Likewise, the usage rate of the touch sensation intensity was strongest in the same states (California, Texas, and Florida), as shown in Figure 7b. From this result, we consider that the sensation intensity has the potential to be used for the analysis of natural phenomena. Meanwhile, the usage statistics of Twitter could certainly influence this kind of experiment: the result could be written off as untrustworthy if there were a critical imbalance in Twitter usage or data volume, such as in Figure 7b. To determine the reliability of our experiments, we surveyed the tweet volume as well as the rate of sensation expression over all states during that time period. In Figure 8a, the green color indicates the tweet volume, and the percentage values displayed on the states show the rate of sensation expression within the total tweet volume. The average rate of sensation expression usage was 19.3%, and the standard deviation was 3.78, as denoted by the box plot in Figure 8b. Remarkably, the rate of sensation expression remained similar across all states, even though the total volumes differed considerably between states. In addition, one interesting finding was that the high-temperature states (California, Texas, and Florida) showed high rates of sensation expression, namely 32%, 25%, and 30%, respectively.
From our experiments, these results suggest that sensation expression reacts to natural phenomena. Thus, sensation information can be considered a useful measure for estimating human behaviors. This may be worth considering in fields of practical application such as marketing or business. For example, a marketing company could optimize advertisement themes or catchphrase designs according to the sensation intensities of specific geo-spatial locations.

5. Conclusions and Future Work

Sensation information is not only closely connected to human behavior but also makes it easier to understand social perceptual experiences. This paper focused on representations of sensation in social media, including sight (ophthalmoception), hearing (audioception), touch (tactioception), smell (olfacception), and taste (gustaoception). A knowledge-based approach was proposed utilizing the Lesk algorithm with WordNet and a manually collected sensation word set (including terms such as angular, breezy, cry, delicious, and sweet). A measurement for sensory representation in social media sentences, called "sensation intensity", was defined, and machine learning techniques were applied to conduct a binary classification using the defined sensation features. Experiments were conducted using data from Twitter to determine the performance of our classification approach relative to several traditional and neural network baselines, including SVM, random forest, MLP, CNN, and RCNN. Furthermore, the importance of sensation intensities according to geo-spatial location was analyzed. This analysis determined that sensitivity to different senses varies distinctively between English-speaking and non-English-speaking countries, including Latin America and Eastern European countries. In addition, geo-spatial characteristics of sensation intensities in the United States were identified by comparing sensation expressions with natural phenomena, showing that geo-spatial characteristics can be represented by sensation intensities. This could also be applied to marketing or business strategies for optimizing targeted marketing.
In future work, these experiments will be continued with a larger dataset to achieve more robust results. Additionally, the classification will be extended to fine-grained identification of each of the five sensory types of social perception from text. Furthermore, a more elaborate sensation measurement based on neural network techniques combined with sensation pre-training will be explored. Finally, a social sensory knowledge base will be constructed for sensation analyses in the same vein as SentiWordNet [60].

Acknowledgments

This work was partially supported by the GRRC program of Gyeonggi province [2017-14-019 Advanced The Development of Intelligent Service Techniques converged with Cm Level Location Information], Japan Society for the Promotion of Science (JSPS) [KAKENHI 15K15995], and based on results obtained from a project commissioned by the New Energy and Industrial Technology Development Organization (NEDO).

Author Contributions

Jun Lee and Kyoung-Sook Kim conceived of the presented idea. Jun Lee developed the theory and performed the computations. Hirotaka Ogawa and YongJin Kwon helped supervise the project. Kyoung-Sook Kim supervised the findings of this work. All authors provided critical feedback and helped shape the research, analysis and manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Kietzmann, J.H.; Hermkens, K.; McCarthy, I.P.; Silvestre, B.S. Social media? Get serious! Understanding the functional building blocks of social media. Bus. Horiz. 2011, 54, 241–251. [Google Scholar] [CrossRef]
  2. Hanna, R.; Rohm, A.; Crittenden, V.L. We’re all connected: The power of the social media ecosystem. Bus. Horiz. 2011, 54, 265–273. [Google Scholar] [CrossRef]
  3. Constantinides, E. Social Media/Web 2.0 as Marketing Parameter: An Introduction. In Proceedings of the 8th International Congress Marketing Trends, Paris, France, 16–17 January 2009; pp. 15–17. [Google Scholar]
  4. Neti, S. Social media and its role in marketing. Int. J. Enterp. Comput. Bus. Syst. 2011, 1, 1–15. [Google Scholar]
  5. Chen, H.; Chiang, R.H.L.; Storey, V.C. Business Intelligence and Analytics: From Big Data to Big Impact. MIS Q. 2012, 36, 1165–1188. [Google Scholar]
  6. Zeng, D.; Chen, H.; Lusch, R.; Li, S.H. Social Media Analytics and Intelligence. IEEE Intell. Syst. 2010, 25, 13–16. [Google Scholar] [CrossRef]
  7. Mangold, W.G.; Faulds, D.J. Social media: The new hybrid element of the promotion mix. Bus. Horiz. 2009, 52, 357–365. [Google Scholar] [CrossRef]
  8. Minelli, M.; Chambers, M.; Dhiraj, A. Big Data, Big Analytics: Emerging Business Intelligence and Analytic Trends for Today’s Businesses; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2013. [Google Scholar] [CrossRef]
  9. He, W.; Zha, S.; Li, L. Social media competitive analysis and text mining: A case study in the pizza industry. Int. J. Inf. Manag. 2013, 33, 464–472. [Google Scholar] [CrossRef]
  10. Malthouse, E.C.; Haenlein, M.; Skiera, B.; Wege, E.; Zhang, M. Managing Customer Relationships in the Social Media Era: Introducing the Social CRM House. J. Interact. Mark. 2013, 27, 270–280. [Google Scholar] [CrossRef]
  11. Dey, L.; Haque, S.M.; Khurdiya, A.; Shroff, G. Acquiring Competitive Intelligence from Social Media. In Proceedings of the 2011 Joint Workshop on Multilingual OCR and Analytics for Noisy Unstructured Text Data, Beijing, China, 17 September 2011; ACM: New York, NY, USA, 2011; pp. 3:1–3:9. [Google Scholar] [CrossRef]
  12. Governatori, G.; Iannella, R. A modelling and reasoning framework for social networks policies. Enterp. Inf. Syst. 2011, 5, 145–167. [Google Scholar] [CrossRef]
  13. Pang, B.; Lee, L. Opinion Mining and Sentiment Analysis. Found. Trends Inf. Retr. 2008, 2, 1–135. [Google Scholar] [CrossRef]
  14. Pak, A.; Paroubek, P. Twitter as a Corpus for Sentiment Analysis and Opinion Mining. In Proceedings of the International Conference on Language Resources and Evaluation, LREC 2010, Valletta, Malta, 17–23 May 2010. [Google Scholar]
  15. Hutto, C.J.; Gilbert, E. VADER: A Parsimonious Rule-Based Model for Sentiment Analysis of Social Media Text; ICWSM; Adar, E., Resnick, P., Choudhury, M.D., Hogan, B., Oh, A.H., Eds.; The AAAI Press: Palo Alto, CA, USA, 2014; Available online: http://www.aaai.org/ocs/index.php/ICWSM/ICWSM14/paper/view/8109 (accessed on 29 November 2017).
  16. Cambria, E.; Schuller, B.; Xia, Y.; Havasi, C. New Avenues in Opinion Mining and Sentiment Analysis. IEEE Intell. Syst. 2013, 28, 15–21. [Google Scholar] [CrossRef]
  17. Damian, R.I.; Sherman, J.W. A process-dissociation examination of the cognitive processes underlying unconscious thought. J. Exp. Soc. Psychol. 2013, 49, 228–237. [Google Scholar] [CrossRef]
  18. Huizenga, H.M.; Wetzels, R.; van Ravenzwaaij, D.; Wagenmakers, E.J. Four empirical tests of Unconscious Thought Theory. Organ. Behav. Hum. Decis. Process. 2012, 117, 332–340. [Google Scholar] [CrossRef]
  19. Dijksterhuis, A.; Nordgren, L.F. A Theory of Unconscious Thought. Perspect. Psychol. Sci. 2006, 1, 95–109. [Google Scholar] [CrossRef] [PubMed]
  20. Simonson, I. In Defense of Consciousness: The Role of Conscious and Unconscious Inputs in Consumer Choice. J. Consum. Psychol. 2005, 15, 211–217. [Google Scholar] [CrossRef]
  21. Lindstrom, M. Broad sensory branding. J. Prod. Brand Manag. 2005, 14, 84–87. [Google Scholar] [CrossRef]
  22. Hultén, B. Sensory marketing: The multi-sensory brand—Experience concept. Eur. Bus. Rev. 2011, 23, 256–273. [Google Scholar] [CrossRef]
  23. Krishna, A. Sensory Marketing: Research on the Sensuality of Products; Taylor & Francis: Abingdon, UK, 2011. [Google Scholar]
  24. Krishna, A. An integrative review of sensory marketing: Engaging the senses to affect perception, judgment and behavior. J. Consum. Psychol. 2012, 22, 332–351. [Google Scholar] [CrossRef]
  25. Schmitt, B. Experiential Marketing: How to Get Customers to Sense, Feel, Think, Act, Relate; Free Press: New York, NY, USA, 2000. [Google Scholar]
  26. Huang, X.I.; Zhang, M.; Hui, M.K.; Wyer, R.S. Warmth and conformity: The effects of ambient temperature on product preferences and financial decisions. J. Consum. Psychol. 2014, 24, 241–250. [Google Scholar] [CrossRef]
  27. Pastra, K.; Balta, E.; Dimitrakis, P.; Karakatsiotis, G. Embodied Language Processing: A New Generation of Language Technology. Available online: https://www.aaai.org/ocs/index.php/WS/AAAIW11/paper/viewFile/4003/4293 (accessed on 24 November 2017).
  28. Lee, J.; Kim, K.S.; Kwon, Y.; Ogawa, H. Understanding Human Perceptual Experience in Unstructured Data on the Web. In Proceedings of the International Conference on Web Intelligence, WI ’17, Leipzig, Germany, 23–26 August 2017; ACM: New York, NY, USA, 2017; pp. 491–498. [Google Scholar] [CrossRef]
  29. Miller, G.A. WordNet: A Lexical Database for English. Commun. ACM 1995, 38, 39–41. [Google Scholar] [CrossRef]
  30. Adedoyin-Olowe, M.; Gaber, M.M.; Stahl, F.T. A Survey of Data Mining Techniques for Social Media Analysis. J. Data Min. Digit. Hum. 2014, arXiv:1312.4617. [Google Scholar]
  31. Liu, B.; Zhang, L. A Survey of Opinion Mining and Sentiment Analysis. In Mining Text Data; Springer: Boston, MA, USA, 2012; pp. 415–463. [Google Scholar]
  32. Turney, P.D. Thumbs up or Thumbs down: Semantic Orientation Applied to Unsupervised Classification of Reviews. In Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, ACL ’02, Philadelphia, PA, USA, 7–12 July 2002; Association for Computational Linguistics: Stroudsburg, PA, USA, 2002; pp. 417–424. [Google Scholar] [CrossRef]
  33. Pang, B.; Lee, L.; Vaithyanathan, S. Thumbs up: Sentiment Classification Using Machine Learning Techniques. In Proceedings of the ACL-02 Conference on Empirical Methods in Natural Language Processing—Volume 10, EMNLP ’02, Philadelphia, PA, USA, 6–7 July 2002; Association for Computational Linguistics: Stroudsburg, PA, USA, 2002; pp. 79–86. [Google Scholar] [CrossRef]
  34. Cheng, A.J.; Chen, Y.Y.; Huang, Y.T.; Hsu, W.H.; Liao, H.Y.M. Personalized Travel Recommendation by Mining People Attributes from Community-contributed Photos. In Proceedings of the 19th ACM International Conference on Multimedia, MM ’11, Scottsdale, AZ, USA, 28 November–1 December 2011; ACM: New York, NY, USA, 2011; pp. 83–92. [Google Scholar] [CrossRef]
  35. Guy, I.; Zwerdling, N.; Ronen, I.; Carmel, D.; Uziel, E. Social Media Recommendation Based on People and Tags. In Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR ’10, Geneva, Switzerland, 19–23 July 2010; ACM: New York, NY, USA, 2010; pp. 194–201. [Google Scholar] [CrossRef]
  36. Hermida, A.; Fletcher, F.; Korell, D.; Logan, D. Share, like, recommend. J. Stud. 2012, 13, 815–824. [Google Scholar] [CrossRef]
  37. Bollen, J.; Mao, H.; Zeng, X. Twitter mood predicts the stock market. J. Comput. Sci. 2010, 2, 1–8. [Google Scholar] [CrossRef]
  38. Nguyen, T.H.; Shirai, K.; Velcin, J. Sentiment analysis on social media for stock movement prediction. Expert Syst. Appl. 2015, 42, 9603–9611. [Google Scholar] [CrossRef]
  39. Si, J.; Mukherjee, A.; Liu, B.; Li, Q.; Li, H.; Deng, X. Exploiting Topic based Twitter Sentiment for Stock Prediction. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (ACL 2013), Sofia, Bulgaria, 4–9 August 2013; Volume 2: Short Papers. pp. 24–29. Available online: http://aclweb.org/anthology/P/P13/P13-2005.pdf (accessed on 23 November 2017).
  40. Watts, D.; George, K.M.; Kumar, T.A.; Arora, Z. Tweet sentiment as proxy for political campaign momentum. In Proceedings of the 2016 IEEE International Conference on Big Data (BigData 2016), Washington, DC, USA, 5–8 December 2016; pp. 2475–2484. [Google Scholar] [CrossRef]
  41. Shirky, C. The Political Power of Social Media: Technology, the Public Sphere, and Political Change. Foreign Aff. 2011, 90, 28–41. [Google Scholar]
  42. Ratkiewicz, J.; Conover, M.; Meiss, M.; Goncalves, B.; Flammini, A.; Menczer, F. Detecting and Tracking Political Abuse in Social Media. In Proceedings of the International AAAI Conference on Web and Social Media, Barcelona, Spain, 17–21 July 2011. [Google Scholar]
  43. De Vries, L.; Gensler, S.; Leeflang, P.S. Popularity of Brand Posts on Brand Fan Pages: An Investigation of the Effects of Social Media Marketing. J. Interact. Mark. 2012, 26, 83–91. [Google Scholar] [CrossRef]
  44. Hollebeek, L.D.; Glynn, M.S.; Brodie, R.J. Consumer Brand Engagement in Social Media: Conceptualization, Scale Development and Validation. J. Interact. Mark. 2014, 28, 149–165. [Google Scholar] [CrossRef] [Green Version]
  45. Jussila, J.J.; Kärkkäinen, H.; Aramo-Immonen, H. Social Media Utilization in Business-to-business Relationships of Technology Industry Firms. Comput. Hum. Behav. 2014, 30, 606–613. [Google Scholar] [CrossRef]
  46. Baird, C.H.; Parasnis, G. From social media to social customer relationship management. Strategy Leadersh. 2011, 39, 30–37. [Google Scholar] [CrossRef]
  47. Tan, S.; Wang, Y.; Cheng, X. Combining Learn-based and Lexicon-based Techniques for Sentiment Detection without Using Labeled Examples. In Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR ’08, Singapore, 20–24 July 2008; ACM: New York, NY, USA, 2008; pp. 743–744. [Google Scholar] [CrossRef]
  48. Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef] [PubMed]
  49. Tang, D.; Qin, B.; Liu, T. Document Modeling with Gated Recurrent Neural Network for Sentiment Classification. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, Lisbon, Portugal, 17–21 September 2015; Association for Computational Linguistics: Lisbon, Portugal, 2015; pp. 1422–1432. Available online: http://aclweb.org/anthology/D15-1167 (accessed on 23 November 2017).
  50. Kim, Y. Convolutional Neural Networks for Sentence Classification. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 25–29 October 2014; Association for Computational Linguistics: Doha, Qatar, 2014; pp. 1746–1751. [Google Scholar]
  51. Poria, S.; Cambria, E.; Gelbukh, A. Deep Convolutional Neural Network Textual Features and Multiple Kernel Learning for Utterance-Level Multimodal Sentiment Analysis. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, Lisbon, Portugal, 17–21 September 2015; Association for Computational Linguistics: Lisbon, Portugal, 2015; pp. 2539–2544. Available online: http://aclweb.org/anthology/D15-1303 (accessed on 23 November 2017).
  52. Lai, S.; Xu, L.; Liu, K.; Zhao, J. Recurrent Convolutional Neural Networks for Text Classification. In Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, AAAI’15, Austin, TX, USA, 25–30 January 2015; AAAI Press: Palo Alto, CA, USA, 2015; pp. 2267–2273. [Google Scholar]
  53. Navigli, R. Word Sense Disambiguation: A Survey. ACM Comput. Surv. 2009, 41, 10:1–10:69. [Google Scholar] [CrossRef]
  54. Shutova, E. Automatic Metaphor Interpretation as a Paraphrasing Task. In Proceedings of the Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, HLT ’10, Los Angeles, CA, USA, 1–6 June 2010; Association for Computational Linguistics: Stroudsburg, PA, USA, 2010; pp. 1029–1037. [Google Scholar]
  55. Yarowsky, D. Unsupervised Word Sense Disambiguation Rivaling Supervised Methods. In Proceedings of the 33rd Annual Meeting on Association for Computational Linguistics, ACL ’95, Cambridge, MA, USA, 26–30 June 1995; Association for Computational Linguistics: Stroudsburg, PA, USA, 1995; pp. 189–196. [Google Scholar]
  56. Banerjee, S.; Pedersen, T. An Adapted Lesk Algorithm for Word Sense Disambiguation Using WordNet. Computational Linguistics and Intelligent Text Processing; Gelbukh, A., Ed.; Springer: Berlin/Heidelberg, Germany, 2002; pp. 136–145. [Google Scholar]
  57. Palanisamy, P.; Yadav, V.; Elchuri, H. Serendio: Simple and Practical Lexicon Based Approach to Sentiment Analysis. In Proceedings of the Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013), Atlanta, GA, USA, 14–15 June 2013; Association for Computational Linguistics: Atlanta, GA, USA, 2013; pp. 543–548. Available online: http://www.aclweb.org/anthology/S13-2091 (accessed on 13 November 2017).
  58. Melville, P.; Gryc, W.; Lawrence, R.D. Sentiment Analysis of Blogs by Combining Lexical Knowledge with Text Classification. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’09, Paris, France, 28 June–1 July 2009; ACM: New York, NY, USA, 2009; pp. 1275–1284. [Google Scholar] [CrossRef]
  59. Taboada, M.; Brooke, J.; Tofiloski, M.; Voll, K.; Stede, M. Lexicon-based Methods for Sentiment Analysis. Comput. Linguist. 2011, 37, 267–307. [Google Scholar] [CrossRef]
  60. Baccianella, S.; Esuli, A.; Sebastiani, F. SentiWordNet 3.0: An Enhanced Lexical Resource for Sentiment Analysis and Opinion Mining; LREC; Calzolari, N., Choukri, K., Maegaard, B., Mariani, J., Odijk, J., Piperidis, S., Rosner, M., Tapias, D., Eds.; European Language Resources Association: Paris, France, 2010; Available online: http://nmis.isti.cnr.it/sebastiani/Publications/LREC10.pdf (accessed on 13 November 2017).
  61. Agerri, R.; García-Serrano, A. Q-WordNet: Extracting Polarity from WordNet Senses. In Proceedings of the International Conference on Language Resources and Evaluation, LREC 2010, Valletta, Malta, 17–23 May 2010; Available online: http://www.lrec-conf.org/proceedings/lrec2010/pdf/695_Paper.pdf (accessed on 13 November 2017).
  62. Strapparava, C.; Valitutti, A. WordNet-Affect: An Affective Extension of WordNet. In Proceedings of the 4th International Conference on Language Resources and Evaluation, ELRA, Lisbon, Portugal, 26–28 May 2004; pp. 1083–1086. Available online: http://www.lrec-conf.org/proceedings/lrec2004/369.pdf (accessed on 3 October 2017).
  63. Kounios, J.; Fleck, J.I.; Green, D.L.; Payne, L.; Stevenson, J.L.; Bowden, E.M.; Jung-Beeman, M. The origins of insight in resting-state brain activity. Neuropsychologia 2008, 46, 281–291. [Google Scholar] [CrossRef] [PubMed]
  64. Spence, C.; Shankar, M.U. The Influence of Auditory Cues on The Perception of, and Responses to, Food and Drink. J. Sens. Stud. 2010, 25, 406–430. [Google Scholar] [CrossRef]
  65. Gimpel, K.; Schneider, N.; O’Connor, B.; Das, D.; Mills, D.; Eisenstein, J.; Heilman, M.; Yogatama, D.; Flanigan, J.; Smith, N.A. Part-of-speech Tagging for Twitter: Annotation, Features, and Experiments. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: Short Papers—Volume 2, HLT ’11, Portland, OR, USA, 19–24 June 2011; Association for Computational Linguistics: Stroudsburg, PA, USA, 2011; pp. 42–47. Available online: http://dl.acm.org/citation.cfm?id=2002736.2002747 (accessed on 2 November 2017).
  66. Lesk, M. Automatic Sense Disambiguation Using Machine Readable Dictionaries: How to Tell a Pine Cone from an Ice Cream Cone. In Proceedings of the 5th Annual International Conference on Systems Documentation, SIGDOC ’86, Toronto, ON, Canada, 1 June 1986; ACM: New York, NY, USA, 1986; pp. 24–26. [Google Scholar] [CrossRef]
  67. Luhn, H.P. A Statistical Approach to Mechanized Encoding and Searching of Literary Information. IBM J. Res. Dev. 1957, 1, 309–317. [Google Scholar] [CrossRef]
  68. Yu, H.; Han, J.; Chang, K.C.C. PEBL: Positive Example Based Learning for Web Page Classification Using SVM. In Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’02, Edmonton, AB, Canada, 23–25 July 2002; ACM: New York, NY, USA, 2002; pp. 239–248. [Google Scholar] [CrossRef]
  69. John, G.H.; Langley, P. Estimating Continuous Distributions in Bayesian Classifiers. In Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence, UAI’95, Montreal, QC, Canada, 18–20 August 1995; Morgan Kaufmann Publishers Inc.: San Francisco, CA, USA, 1995; pp. 338–345. [Google Scholar]
  70. Quinlan, J.R. C4.5: Programs for Machine Learning; Morgan Kaufmann Publishers Inc.: San Francisco, CA, USA, 1993. [Google Scholar]
  71. Ho, T.K. Random Decision Forests. In Proceedings of the Third International Conference on Document Analysis and Recognition (ICDAR ’95), Montreal, QC, Canada, 14–16 August 1995; IEEE Computer Society: Washington, DC, USA, 1995; Volume 1, pp. 278–282. [Google Scholar]
  72. Hall, M.; Frank, E.; Holmes, G.; Pfahringer, B.; Reutemann, P.; Witten, I.H. The WEKA Data Mining Software: An Update. SIGKDD Explor. Newsl. 2009, 11, 10–18. [Google Scholar] [CrossRef]
  73. Cybenko, G. Approximation by superpositions of a sigmoidal function. Math. Control Signals Syst. 1989, 2, 303–314. [Google Scholar] [CrossRef]
  74. Chang, C.C.; Lin, C.J. LIBSVM: A Library for Support Vector Machines. ACM Trans. Intell. Syst. Technol. 2011, 2, 27:1–27:27. [Google Scholar] [CrossRef]
  75. Hilton, K. Psychology the science of sensory marketing. Harv. Bus. Rev. 2015, 3, 28–31. [Google Scholar]
Figure 1. Human Perceptual Experiences.
Figure 2. Overall Tasks of Sensation Classification with Spatial Footprint.
Figure 3. Comparison of Sensation Intensity Ratio with Survey Result.
Figure 4. Normalized Sensation Intensity by Weight Functions.
Figure 5. Sensation Intensity of Major Geo-spatial Groups.
Figure 6. Analysis of Geo-spatial Sensation Globally.
Figure 7. The Comparison between The Average Temperatures and Touch Sensation Intensity for Every State in The United States during Summer Season of 2010 to 2014.
Figure 8. The Usage Statistics of Twitter in United States during Summer Season of 2010 to 2014.
Table 1. Selected semantic relations from WordNet synsets.

Part-of-Speech    Selected Semantic Relation
Noun, Verb        {direct hypernym, sister term}
Adjective         {see also, similar term}
Adverb            {synonyms}
Table 2. Comparison with results from sensation and word features.

Features               Classifier       F1 Measure   Accuracy
Word feature           Random Forest    60.7         61.24
                       SVM              64.4         65.68
                       MLP              68.2         69.88
                       CNN              79.1         79.33
                       RCNN             80.1         80.39
Sensation feature      Naive Bayes      61.2         62.42
(Weight (4))           C4.5             68.9         68.97
                       Random Forest    78.4         78.56
                       SVM              80.1         80.24
