| A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 4/24/2013 | Title | Abstract | Link to presentation | Link to poster | Final report submitted | ||||||||||||||
2 | Vagelis Papalexakis + others | Recommendations using Coupled Matrix Factorization | In this project we propose to enhance a user rating prediction and recommendation engine using information collected from other popular websites. Consider the example of Netflix, where users watch movies and then rate them on a scale of 1-5. Suppose that we store this rating information in a matrix X, whose rows correspond to all movies that Netflix provides and whose columns correspond to users. By filling in the missing values of this matrix, we are able to provide recommendations to users, provided that the predicted rating is sufficiently high. We enhance the prediction and recommendation engine using additional pieces of information built upon the same set of movies but coming from other sources, e.g. IMDB; let Y denote the matrix that stores this information. One could imagine a multitude of possible features that could be extracted from IMDB and used as columns of Y. We propose to jointly factorize X and Y into a set of low rank components, such that an appropriate loss function is minimized. | http://goo.gl/nkuvG | https://www.dropbox.com/s/634nnzydx7l961g/poster.pdf | yes | ||||||||||||||
3 | Victor Hwang, Jen King, Bryan Hood | Yelp Me - Inferring Future Business Attention | Yelp collects many different forms of data about a particular business - user reviews, ratings, location and more. However, it would be valuable to businesses to get a glimpse of their future Yelp reviews - how many reviews will they get in the next month? Will their average rating improve? In this paper, we describe methods for inferring future business attention using regression models and sentiment analysis of reviews. | https://www.dropbox.com/s/s53lh1fpdgyp68d/yelp_me_first_day_2nd_presentation.pdf | yes | |||||||||||||||
4 | Sam Gruber (scgruber), Adam Mihalcin (amihalci), and Prashant Sridhar (psridha1) | Automatically Inferring Tag Hierarchies in Social Tagging Systems | We propose new models to induce a hierarchy over a set of tags in a social tagging system, also known as a folksonomy. We compare hierarchical clustering based solely on the tag descriptions with a method that first classifies tags into classes and then performs hierarchical clustering on the classes, and finally a method that looks at each tag as a set of posts, and considers two tags to be linked in a parent-child relationship if one tag’s set of posts is close to a subset of the other’s set of posts. | http://www.andrew.cmu.edu/user/amihalci/tag-hierarchy.pdf | www.andrew.cmu.edu/~amihalci/tag-hierarchy-poster.pdf | yes | ||||||||||||||
5 | Matt Mukerjee & David Naylor | Improving Netflix CDN Performance with Smarter Prefetching | Netflix distributes free "Open Connect" boxes to ISPs, which collectively function as a Netflix-only content delivery network (CDN). Each Open Connect box includes many hard disks (HDDs) and a layer of caching solid state drives (SSDs). Despite being equipped with multiple 10Gbps Ethernet cards, these boxes offer a surprisingly low 8Gbps peak throughput. A large part of the problem is that the sheer number of users making requests in parallel turns thousands of individual sequential reads into what is effectively a huge collection of random reads, destroying disk performance. We conduct a trace-driven study to demonstrate the viability of providing better throughput with smarter prefetching. We seek to make informed decisions about what data should be prefetched and stored in the SSD cache at any given time by analyzing data provided by Conviva, a CDN that services online video delivery for companies including HBO and ESPN. Specifically, for a given user U and video V, we seek to estimate how much of V user U will watch (will they give up after 10 seconds, or watch until the end?). As most users will not have watched most videos, we need to deal with missing data; we use collaborative filtering to fill in these blanks with predicted view times (akin to a recommender system filling in missing ratings). We enhance our scheme further with community modeling: we group users based on their region to find communities of members within a given region, providing tailored prefetching recommendations to individual Open Connect nodes. | https://www.dropbox.com/s/rhgpd8rzd5ili5k/mukerjee-naylor-presentation.pdf | https://www.dropbox.com/s/95pqookal62g5z7/mukerjee-naylor-poster.pdf | yes | ||||||||||||||
6 | Sam Zhang (xiuyuanz), Jerry Vinokurov (jerryv) | Machine learning on predicting NBA games against handicap bet | Betting on sports is a big business. Betting on scoring sports such as basketball generally works as follows: the bookmaker sets a “line,” which indicates which team is favored and by how much. Beating the line is hence much more complicated than guessing which team is going to win. But with the abundance of analytic data available online, our team set the goal of beating the odds given by the public handicapper. In our project, we collected one year of NBA analytics from the internet, processed the data using PCA and clustering, and applied a Naive Bayes classifier to the processed data to predict whether the score offset provided by the handicapper is low or high. | https://docs.google.com/file/d/0B314Jg1QgZGwblpEXzdMUTYzRzA/edit?usp=sharing | https://docs.google.com/file/d/0B314Jg1QgZGwT29wZ2FUMnhmN00/edit?usp=sharing | yes | ||||||||||||||
7 | Matineh Eybpoosh, Jingkun Gao, Leneve Ong | Robust Damage Detection of Civil Structures under Operational and Environmental Variations | Near real-time measurement data from mechanical and structural systems has been widely used in Structural Health Monitoring (SHM) to detect faulty operations and damage more efficiently than traditional technologies. However, the majority of these approaches have been developed under controlled laboratory or simulated conditions, either ignoring the effects of real-world environmental and operational conditions (EOCs) in their analyses, or relying on controlled and known impacts of EOCs. Most of these methods are based on detecting shifts in the resonant frequencies or modes of structures, assuming that damage is the only factor leading to such shifts, or that such shifts can be easily differentiated from those caused by EOCs. However, as a number of studies have shown, the effects of EOCs can blur the effects of damage in the collected data, or lead to incorrect conclusions about the existence or location of the damage. The ultimate goal of this project is to develop methods for structural damage detection in water supply pipes operating under varying environmental and operational conditions. This would be achieved by identifying damage-sensitive, EOC-insensitive features and applying appropriate machine learning techniques to classify the different states of the pipes. | https://www.dropbox.com/s/ojyg4lk6u56qmwq/MatinehEybpoosh_Presentation.pptx | https://www.dropbox.com/s/45jeqpusfk1oeij/poster.pdf | yes | ||||||||||||||
8 | Weinan Yang, Xiaoyu Cui, Mohammad Mehdi Niki Rashidi, Zihao Wang | The analysis of the relationship between the feature selection process PCA and the bridge vibration theory | In this report, we present a mathematical proof of the close relationship between the machine learning technique PCA and bridge vibration theory, and show the feasibility of using these features for classification, based on ongoing research on “Indirect Bridge Structural Health Monitoring” in Civil and Environmental Engineering. By seeking the mathematical connection between structural analysis and machine learning techniques, we are able to use PCA to select features from the signals collected by sensors mounted on a travelling car model on a laboratory bridge model during an experiment. An SVM classifier is used to classify the signals from different ‘damage’ scenarios, defined by conditions in which different amounts of mass are placed on the bridge. The results show that PCA is a valid feature selection method and that the SVM classifier achieves high accuracy for this laboratory experiment. | https://www.dropbox.com/s/42cifpoj4wwfyko/10701.pdf?n=99619873 | https://www.dropbox.com/l/R9mamNNwLDyt4qj77MWrDa | yes | ||||||||||||||
9 | Manzil Zaheer, Milad Memarzadeh, Suman Giri | Unsupervised Disaggregation of Power Consumption data | The problem of estimating appliance level power consumption based on aggregate level power information at the main circuit has been well studied. Both supervised and unsupervised algorithms have been proposed to solve this problem. In this project, we use the Additive Factorial Hidden Markov Model formulation proposed by Kolter and Jaakkola to disaggregate the total power consumption of a building into appliance level constituents. For the purposes of this project, we work with one week of simulated data for five appliances, assuming a 1 Hz sampling rate. Although the problem is intractable in theory, we use inference algorithms proposed in the literature to estimate the contribution of different sources in a reasonable amount of time. We discuss our results and propose further steps for applying this to a real dataset. | |||||||||||||||||
10 | Ruikun Luo, Fanyi Xiao. | Multi-Task Regularization with Covariance Dictionary for Linear Classifiers | In this paper we propose a multi-task linear classifier learning problem called D-SVM (Dictionary SVM). D-SVM uses a dictionary of parameter covariance shared by all tasks to transfer knowledge among different tasks. We formally define the learning problem of D-SVM and show two interpretations of this problem, from both the probabilistic and kernel perspectives. From the probabilistic perspective, we show that our learning formulation is actually a MAP estimation on all optimization variables. We also show its equivalence to a multiple kernel learning problem in which one is trying to find a re-weighting kernel for features from a dictionary of basis (despite the fact that only linear classifiers are learned). Finally, we describe an alternative optimization scheme to minimize the objective function and present empirical studies to validate our algorithm. | https://www.dropbox.com/s/acvybsja3tw2m8f/Fanyi%20Xiao_Ruikun%20Luo.pdf | https://www.dropbox.com/s/gzi6c0770gabqqw/ml_poster_fyx_lrk1.pdf | yes | ||||||||||||||
11 | Zhuo Chen and Brandon Taylor (must present 4/24) | ISIS Passive RADAR Event Classification | The ISIS Array (Intercepted Signals for Ionospheric Science) is an instrumentation project designed to construct a coherent software radio network capable of operating as a flexible multi-role distributed radio science instrument. We are examining passive RADAR data from the ISIS array to detect events of interest to ionospheric scientists. Since these RADARs use commercial radio broadcasts as a signal, they are far less costly than active RADARs to operate and can thus be used to collect vast quantities of data. Our goal is to develop an algorithm for automatically searching the RADAR data and detecting potentially interesting events (e.g. meteors, ionospheric weather events) and data artifacts. | https://www.dropbox.com/s/dq0ccgnku71olnk/TaylorZhuo_ML10-701.pdf | https://www.dropbox.com/s/gs36ebrv62rmt2t/radar-poster.pdf | |||||||||||||||
12 | Xiaolong Shen, Shuchang Liu, Zhu Meng | Inferring Gene Regulatory Network with Prior Knowledge | A gene regulatory network usually refers to the ensemble of DNA segments and proteins in a cell, together with the causal schemes by which these elements interact with each other and with other substances in the cell [1]. As previous studies have demonstrated, variation in regulators, especially transcription factors, can account for the observed expression differences. Many studies discuss the correlation between transcription factors and gene expression, and many interesting algorithms address this problem, such as the hierarchical Bayesian graphical Gaussian model, Bayesian networks, the Dantzig selector, the Lasso, and their meta-analysis. However, studies of regulatory networks seldom consider the interactions between regulatory motifs and genes, such as signaling pathways, at the same time. We therefore start by building a network model of the regulatory process that includes both the impact of regulators on genes and the influence of gene expression on regulators, both of which have been shown to exist. We treat regulators and gene expression levels as input and output respectively, both of which are dependent variables. We set up the model to learn from time-series gene expression data, and also test how different kinds of biological prior knowledge assist model learning. | https://docs.google.com/file/d/0BzKBY5I3SKcMTHdIVHBKNE5pdm8/edit?usp=sharing | https://docs.google.com/file/d/0BzKBY5I3SKcMRm5ZQmJMSnVQWUE/edit?usp=sharing | yes | ||||||||||||||
13 | Boyuan Li | Point clouds object recognition based on Support Vector Machine | With the advent of new, low-cost 3D sensing hardware such as the Kinect, and continued research on advanced point cloud processing, 3D perception is becoming important and valuable in robotics, building and other fields. In this paper, a support vector machine (SVM) based machine learning framework is developed for object recognition in RGBD point cloud data from the Microsoft Kinect sensor. With this framework, equipment with a Kinect will take the labels of objects as input and output the most likely object after scanning. As a demonstration of the algorithm, the office workplace (Intelligent Workplace, MMCH, CMU) is used as the point cloud test bed. The collected point cloud data is pre-processed, including registration and 3D image segmentation. This framework has been tested on a database of 9 selected objects of 12 point clouds. | https://dl.dropboxusercontent.com/u/7009677/Boyuan_Li_10701%20(1).pdf | https://dl.dropboxusercontent.com/u/7009677/poster.pdf | yes | ||||||||||||||
14 | Siheng Chen, Tianhao Tang, Xin Wang | Clustering-based Segmentation for Histology Images | The task is to segment histology images. The images contain several common tissues, including bone, fat, cartilage and so on. Our segmentation system consists of image preprocessing, feature extraction and segmentation parts. In the preprocessing stage, super-pixel techniques help to capture local shape and intensity information and then group neighboring pixels into super-pixels. In the feature extraction stage, a histopathology vocabulary is used to mimic the visual cues used by experts. In the segmentation stage, we focus on clustering-based methods and assign super-pixels in the same cluster the same label. The plan is to start from k-means and then explore spectral clustering and subspace clustering techniques. We also want to add the spatial constraint that neighboring super-pixels are more likely to share the same label. We will use several measures to evaluate the performance of the different methods. | http://phoroneus.net/siheng/701_project.pdf | https://dl.dropboxusercontent.com/u/109416725/Poster.pdf | yes | ||||||||||||||
15 | 4/29/2013 | |||||||||||||||||||
16 | Nick Rhinehart, Gaurav Singh, Bhaskar Vaidya | Semi-supervised Inference Machines | We propose to leverage developments in the area of unsupervised image feature dictionary learning (Bo et al.) to augment a state-of-the-art semantic labeling approach (Munoz et al.). Munoz's approach creates an inference machine based on random forest training and stacking, which is used to infer probability distributions of the semantic labels of novel images. However, this approach currently requires hand-labeled training data, significantly limiting the amount of training data that can feasibly be used. By including a feature that is learned without labeled training data, we seek to improve the overall performance of the inference machine. Moreover, by removing the mandatory labeling procedure, training datasets can be quickly augmented with novel images. | https://www.dropbox.com/s/xuxgbsw9r63emgx/FINAL_ML_PRESENTATION.pdf | https://www.dropbox.com/s/a4sfgi0noz2fuf1/poster_slide_real_larger_title.pdf | yes | ||||||||||||||
17 | Yuxiong Wang, Junyan Zhu | Kernelized Self-explanatory Sparse Representation for Image Classification | Sparse representation and kernel methods are both powerful tools for discovering the hidden structure of complex data. In this project, we apply the idea of kernel methods to extend two key steps in sparse representation, dictionary learning and sparse coding, to reproducing kernel Hilbert spaces. In particular, a self-explanatory reformulation of sparse representation is first proposed, in which the basis vectors are substituted by a linear combination of features. This scheme is then generalized to an arbitrary kernel space with computational tractability and conceptual interpretability. Block-wise coordinate descent and Lagrange multipliers are proposed to solve the corresponding sparse coding sub-problem and dictionary learning sub-problem, respectively. Image classification is carried out to evaluate the performance of our sparse representation algorithm in the kernel space. Experimental results on two benchmarks (Caltech-256 and Scene 15) show the effectiveness of our algorithm. | https://docs.google.com/file/d/0BzbsMcQzyu8lb0NYQkktOXBIYWM/edit?usp=sharing | https://docs.google.com/file/d/0BzbsMcQzyu8lWkZnc0FKY1BuM1E/edit?usp=sharing | yes | ||||||||||||||
18 | Divya Hariharan (dharihar), Hariharasudan Malaichamee (hmalaich), Premkumar Natarajan (premkumn), Wassim Ferose (whabeebr) | Objectness Measure of ROIs in Images | Given a Region of Interest in an image, we wish to assign a score to represent how likely it is for the given region to have an object of a particular class. By objects, we mean stand-alone things with a well-defined boundary like a vehicle, human or animals and not amorphous background regions like grass or sky. The major challenge in this problem is that the state-of-the-art object detectors pertain to one class of objects and we want to make it more generic. Proper selection of features and classifiers (and possibly a combination of a few) are important for this problem to achieve good results. | https://www.dropbox.com/s/s0yupk6t8pymjm9/Presentation-Objectness_Measure.pdf | https://www.dropbox.com/s/wwdlogyxwnogup4/MLPoster-Objectness%20Measure-Final.pdf | yes | ||||||||||||||
19 | Daniel Maturana, David Fouhey, Jacob Walker | Semantic Segmentation with Discriminative Mid-level Patches | In this paper, we propose an approach for semantic segmentation that uses detections of visual concepts (or "patches") to model correlations between regions in images. We use a state-of-the-art unsupervised approach to discover a dictionary of visual concepts and detectors. Unlike previous low-level features which encode ubiquitous patterns such as right-angled junctions, these visual concepts are highly specific (e.g., wheels of cars, toilet bowls, etc.) and therefore induce meaningful correspondence when found. We explore methods to extract an accurate semantic segmentation from these correspondences. The first contribution is a novel kernel, which we use within a Gaussian process framework on the task of predicting per-pixel semantic labels. We also propose a method to refine the semantic consistency of the detected correspondences based on Markov Random Fields. Finally, we also implement standard bag-of-words image parsing approaches. We evaluate these approaches on a publicly available scene understanding dataset and show the advantages of our proposed methods. | https://www.dropbox.com/s/0nqedzuwadprl84/slides.pdf | https://www.dropbox.com/s/lvvbkbu20t5jpz8/poster.pdf | yes | ||||||||||||||
20 | Yipei Wang must present April 29 | Semi-automatic audio semantic concept discovery for video retrieval | This paper presents a semi-automatic hierarchical method to discover new audio semantic concepts (sound events) for multimedia content analysis. Our work aims to reduce the human effort in defining and labeling effective audio semantic concepts for multimedia retrieval and summarization. Our approach has three steps. To handle the varying distribution of audio signals, we first use k-means clustering to decompose an audio stream into audio words. For reasonable coverage of the real data, we adopt Latent Dirichlet Allocation (LDA) to find abstract descriptors in a collection of video clips. To map to human interpretable semantics, hierarchical clustering using the topic distribution on pre-annotated segments is adopted to discover sub-category concepts. | https://docs.google.com/file/d/0B4XMJzmuWwZ_bG4wTDZrTXFNYXc/edit | https://docs.google.com/file/d/0B4XMJzmuWwZ_Z0k1M0FBeGtRZ1U/edit | yes | ||||||||||||||
21 | Ishan Misra (imisra), Francisco Vicente (fvicente) and Wen-Sheng Chu (wschu) | Scalable Support Vector Machines and Its Applications to Computer Vision | Support Vector Machines (SVMs) are powerful tools in both classification and regression tasks. The computer vision community, in particular, relies heavily on SVMs for object detection [Felzenszwalb et al. 2010], image retrieval [Shrivastava et al. 2011], face image analysis [Kumar et al. 2009], etc. In this project we address the problems commonly faced when using SVMs on large datasets. Specifically, we focus on the scalability of Exemplar-SVMs and nonlinear SVMs with RBF, X^2 and intersection kernels. The effectiveness of our approaches is demonstrated on two popular computer vision problems, i.e., object detection and facial attribute recognition. | https://www.dropbox.com/s/w90n6w8w9rdrepd/scalable_svm_cv.pdf | yes | |||||||||||||||
22 | Xinghai Hu, Minghao Ruan, and Han Li | Combine cRBM and CNN for Object Recognition | We will present a novel deep network structure that combines the Convolutional Neural Network (CNN) and the Convolutional Restricted Boltzmann Machine (cRBM). This combination can take advantage of the structural connections between stacked RBMs and Multi-layer Perceptrons (MLP). We will analyze the structure of both the CNN and the cRBM, as well as their structural similarities. In addition, we will present an essential component, a probabilistic max-pooling process, which enforces sparsity in the cRBM’s output layer and ensures binary inputs to the next layer of the network. Finally, we will show the superior performance of our framework on two standard datasets over current state-of-the-art methods. | https://www.dropbox.com/s/kwifysh8g47pnvc/ML-13-deelplearn.pdf?n=7833691 | https://www.dropbox.com/s/hspqmqlhnjrhd49/poster_v5.pdf | yes | ||||||||||||||
23 | Jiaji Zhou; Kumar Shaurya Shankar (must present 4/29) | Robust Iterative Policy Search by Dynamic Programming | In many real robotics applications where dynamics and sensor data are noisy, the robustness of the control algorithm is essential. Additionally, most physical systems are high-dimensional, and it is highly unlikely that data will be available for all parts of the state space. Since it is unlikely for any formulated model to completely capture all of the underlying system dynamics, learning control algorithms should exhibit some degree of robustness to undermodeling. In this research project we intend to explore a hybrid reinforcement learning approach to solving a control problem; specifically, we propose to combine Iterative Linear Quadratic Regulator methods with Policy Search by Dynamic Programming. | https://www.dropbox.com/s/8f6fkt2qoku2mux/presentation.pdf | yes | |||||||||||||||
24 | | Gaussian Process Based Filtering for Neural Decoding | Future prosthetic devices will depend on neural decoders to translate measured neural activity from the brain into usable control signals. This task is commonly framed as a filtering problem, where neural spike signals are observations of the hidden effector state. Many modern decoding algorithms use linear transition and observation models learned from data, but have poor performance when reconstructing neural signals from trajectories, suggesting a low fidelity model. Non-parametric function approximators, such as Gaussian Processes, overcome the difficulties in modeling and better represent nonlinearities, but have difficulty filtering in real-time due to high computational complexity. Various techniques for optimizing performance, such as fast approximate GP regression using KD-Trees and dimensionality reduction of observations, can serve to alleviate this. We propose a GP-Unscented Kalman Filter utilizing fast regression techniques for real-time filtering of neural signals. We evaluate the performance of our approach on a dataset taken from an electrode array implanted in the motor cortex of a rhesus monkey performing a structured cursor control task. Our results show that this framework provides accurate trajectory reconstruction as well as a better generative model of neural firing than current methods. | https://www.dropbox.com/s/ivzy8q32k5tyt5u/mac_Spotlight_karthik_humphrey_arun.pptx | https://www.dropbox.com/s/umr9ybbfr71eyzs/MLposter.pptx | yes | ||||||||||||||
25 | Adrian Trejo, Sunguk Choi | Latent Dirichlet Allocation on Biological Data | Pancreatic cancer has one of the highest fatality rates among all cancers, with over 30,000 deaths per year. Biomarkers in patient samples are a promising diagnostic tool for identifying pancreatic cancer, but identifying patterns among different biomarkers is difficult. We apply Latent Dirichlet Allocation (LDA) to patient data to extract significant trends among different biomarkers. We conclude that LDA is a suitable clustering method, generating different biomarker clusters for pancreatic cancer patients and control patients. | http://www.contrib.andrew.cmu.edu/~atrejo/lda.pdf | http://www.contrib.andrew.cmu.edu/~atrejo/poster.pdf | yes | ||||||||||||||
26 | Cemal Erdem, Qiangjian Xi and Rittika Shamsuddin | Predicting Human coding-DNA (cDNA) Sequences from Amino Acid Sequence | The central dogma of molecular biology dictates that genetic information flows from the genetic material, DNA, to proteins. Back-translation, however, is the process of obtaining the DNA sequence given a sequence of amino acids. The uncertainty in this mapping, from proteins to DNA, is due to the degeneracy of the genetic code: one amino acid can be encoded by more than one codon, the nucleic acid triplets in the DNA. Thus, in this project first and second order HMM models are employed. Given a human protein sequence, the most probable coding-DNA (cDNA) sequence is predicted. The accuracy of correctly encoded codons reaches up to 90% for the first order HMM and 80% for the second order HMM. Both algorithms have a mean accuracy of around 50%, where the nucleotide sequences are indeed predicted correctly up to 95%. Moreover, two additional variants of Viterbi decoding proposed herein are employed, but no improvement is achieved in the first order HMM. Lastly, a structured SVM (sSVM) library is utilized, and the mean accuracy obtained is 42%. Being the first paper to utilize a dataset of only individual human gene-protein pairs, we have so far shown that neither the second order HMM nor the sSVM performs better than the simple first order HMM in back-translating amino acids to codons. | https://www.dropbox.com/s/agf6nrnhdqmbstk/BackTranslation_CQR.ppt | https://www.dropbox.com/s/z0tr8jz2gksrrh6/poster.pdf | yes | ||||||||||||||
27 | Xiao Wang (must present 4/29) | Room-Level Indoor Tracking Through Occupancy Inference | | http://goo.gl/TIjP4 | http://goo.gl/tB62j | yes | ||||||||||||||
28 | William Wang, Troy Hua | Predicting Financial Volatility from Earnings Call with Semiparametric Gaussian Copulas | An earnings call summarizes the financial performance of a company over a period, and it is an important indicator of the company's future financial risks. In this project, we quantitatively study how earnings calls are correlated with the measured volatility in the limited future. In particular, we seek to computationally model the language in earnings calls using copula-based models, which do not require any assumptions on the distribution of the covariates and the dependent variable. This work improves on previous studies in text regression by incorporating the dependency and interaction among local features in the form of elliptical copulas. | https://www.dropbox.com/s/payu304p2s70bwe/WangHua.pdf | https://www.dropbox.com/s/3dcfw7tvdtjb6nm/WangHua_Poster.pdf | yes | ||||||||||||||