Question Report
Report Generated: Jun 14, 2020 1:03 AM
Topic | Webinar ID | # Question | Actual Start Time | Actual Duration (minutes)
Artificial Intelligence and Deep learning Course by IIT Roorkee | 962 3602 6138 | 175 | Jun 13, 2020 7:33 PM | 217
Question Details
# | Question | Answer(s) | Asker Name | Asker Email
1 | Are you going to cover multi-task networks or multi-task loss? | I did not get your question. Could you please elaborate? We will cover RMSE as the cost function in this session. | Arpit vijaywargiya | arpitvw16@gmail.com
2 | Why do we need a box plot when we have a histogram for the same visualisation? | live answered; Box plot is another way of looking at data, especially on an interval scale | Sugandhita | sugandhitap@gmail.com
3 | In object detection we have a multi-task network, which detects the object and gives a bounding box, | | Arpit vijaywargiya | arpitvw16@gmail.com
4 | so there will be 2 losses: one for classification and the other for the bounding box | | Arpit vijaywargiya | arpitvw16@gmail.com
5 | What information do we get from skewness and modality? | Machine learning algorithms cannot learn properly if the data is tail-heavy or skewed. We will discuss the reason in the Training Models chapter. | Satyabrat Sabat | satya.jin@gmail.com
6 | Q2) Generally we use the Z-value (depends on median) for the P-value; can we derive the same thing with the mode, as it is robust to outliers? | | Arpit vijaywargiya | arpitvw16@gmail.com
7 | What is the impact of a non distributed feature on model creation? | live answered | sudhir shetty | sudhir.m.shetty@gmail.com
8 | *(depends on mean) | | Arpit vijaywargiya | arpitvw16@gmail.com
9 | *derive things from median | | Arpit vijaywargiya | arpitvw16@gmail.com
10 | Hi, may I know which books we are referring to for the current diagrams? This is something I would like to read in detail too. I purchased the O'Reilly one on Hands-On ML, which was a great help in connecting the last two classes, but these histograms etc. are not there. | These are general concepts; we did not follow any book for them. | Rajiv | krajiv.2018@gmail.com
11 | How is variance different from SD? | Standard deviation and variance both show spread. Standard deviation is the square root of variance. | Rajeev | rajeev213149@gmail.com
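The relation in the answer above can be checked numerically. The following is a minimal sketch using made-up values and the population formulas; the variable names are illustrative and not taken from the lecture code.

```python
import math
import statistics

# Made-up sample values, used only for illustration.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mean = statistics.fmean(values)

# Population variance: the average squared difference from the mean.
# Squaring keeps positive and negative differences from cancelling out.
variance = sum((x - mean) ** 2 for x in values) / len(values)

# Standard deviation: the square root of the variance,
# expressed in the same unit as the data itself.
std_dev = math.sqrt(variance)

print(f"mean={mean}, variance={variance}, std_dev={std_dev}")
```

Because the variance is in squared units, the standard deviation is usually the easier of the two to compare against the raw values.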
12 | error minimise | | Chinmay Athavale | chinmayat@gmail.com
13 | same unit as data | | sudhir shetty | sudhir.m.shetty@gmail.com
14 | otherwise data will get cancelled | | Dr. santoshkumar | ksantosh.11@gmail.com
15 | because there is both + and - diff | | 409992 | supriya.bms@gmail.com
16 | because the difference can be negative and positive | | Shalini Gupta | gupt.shalu1993@gmail.com
17 | so that difference won't cancel | | kunal upadhyay | kupadhy@gmail.com
18 | ok, thank you | | Rajiv | krajiv.2018@gmail.com
19 | What is standard deviation? What is the difference between standard deviation and variance? | Standard deviation and variance both show spread. Standard deviation is the square root of variance. | Anantpadmanabh Divanji | apgd14@gmail.com
20 | Squaring adds more weight to the larger differences | | Anmol Khopade | anmolck@gmail.com
21 | This is important when points further from the mean are important | | Anmol Khopade | anmolck@gmail.com
22 | What is the difference between bias and variance? | The bias error is an error from erroneous assumptions in the learning algorithm. High bias can cause an algorithm to miss the relevant relations between features and target outputs (underfitting). The variance, in contrast, is an error from sensitivity to small fluctuations in the training set. High variance can cause an algorithm to model the random noise in the training data rather than the intended outputs (overfitting). | Vikas Bhartiya | ghivikas@gmail.com
23 | Is the mean and median being equal or close to each other a way of determining whether a distribution is normal? | Normal distributions are symmetric, unimodal, and asymptotic, and the mean, median, and mode are all equal. | Puneet Rsstogi | puneetrstg@gmail.com
24 | Why is the normal distribution important? | A lot of physical phenomena follow the normal distribution. We have built understanding and modelling capabilities for them. | Rajeev | rajeev213149@gmail.com
25 | Does variation of mode vs median tell us anything? | A difference between mode and median typically means there is some skew in the data. It calls for checking for outliers, long or fat tails, etc. | Anmol Khopade | anmolck@gmail.com
26 | How do we know if a dataset follows a normal distribution? | For a normal distribution the mean, median, and mode are equal. To test for normality we can find the skewness of the data. | Nini Nursiah | nursiah.neelesh28@gmail.com
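As a rough illustration of the skewness check mentioned above, the sketch below computes the population skewness (the mean of the cubed z-scores) for two made-up samples. The function name and data are assumptions for illustration. Note that a skewness near zero only indicates symmetry: a symmetric but non-normal distribution (for example a uniform one) also has zero skewness, so this is a quick screen rather than a full normality test.

```python
import statistics

def skewness(values):
    """Population skewness: the mean of the cubed z-scores."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return sum(((x - mean) / std) ** 3 for x in values) / len(values)

# Made-up samples, for illustration only.
symmetric = [1, 2, 3, 4, 5, 6, 7]        # symmetric around 4
right_skewed = [1, 1, 1, 2, 2, 3, 15]    # long right tail

print("symmetric skewness:", skewness(symmetric))
print("right-skewed skewness:", skewness(right_skewed))

# In the right-skewed sample the tail pulls the mean above the median.
print("mean:", statistics.fmean(right_skewed),
      "median:", statistics.median(right_skewed))
```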
27 | What is z-value? | A z-score gives you an idea of how far from the mean a data point is. It's a measure of how many standard deviations below or above the population mean a raw score is. | VED | parmarvedpro5@gmail.com
28 | Why do we take the root of it? | By taking the root, the unit of error is the same as that of the observation | Senthilnathan Ramaswami | senthilnathan.ramaswami@servicenow.com
29 | When do we use RMSE and MSE? | Hi Srini, this is a good question. Can you please post it on the forum and we will explain it in detail. | Srini Boddu | siliconfish@yahoo.com
30 | What is the relation between variance and z-value? | z = (x - mean) / (standard deviation) | VED | parmarvedpro5@gmail.com
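The formula above ties the z-value to the variance through the standard deviation, which is the square root of the variance. A minimal sketch on made-up values (the helper name `z_scores` is illustrative):

```python
import statistics

def z_scores(values):
    """Return (x - mean) / standard deviation for every x."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # square root of the population variance
    return [(x - mean) / std for x in values]

# Made-up values with mean 5 and standard deviation 2.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

for x, z in zip(values, z_scores(values)):
    print(f"x={x}  z={z:+.2f}")
```

A z-value of +2 here means the point lies two standard deviations above the mean.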
31 | How many lines will the algorithm try, as there are infinite ways to assign w0 and w1? | We will learn this in upcoming chapters while discussing gradient descent | Nini Nursiah | nursiah.neelesh28@gmail.com
32 | What is W0 and W1 here? | These are coefficients | Nitin Nigam | nknigam@gmail.com
33 | For drawing different lines, are we using different data? | No, the algorithm tries to find the line which best fits the available data. The best-fitting line is the one with the lowest RMSE. | Veeru(VeeraNancharaiah Javvaji) | jsriveeru@gmail.com
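To make "same data, different lines" concrete, here is a small sketch that scores a handful of hand-picked candidate lines y = w0 + w1 * x against one fixed, made-up dataset and keeps the one with the lowest RMSE. The candidate list is an assumption for illustration; it is a brute-force comparison, not the gradient descent procedure the course covers later.

```python
import math

def rmse(w0, w1, xs, ys):
    """Root mean squared error of the line y = w0 + w1 * x on (xs, ys)."""
    errors = [(w0 + w1 * x) - y for x, y in zip(xs, ys)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))

# One fixed, made-up dataset (roughly y = 1 + 2x); every line is scored on it.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.1, 4.9, 7.2, 8.8]

# Hypothetical candidate (w0, w1) pairs.
candidates = [(0.0, 1.0), (1.0, 2.0), (2.0, 1.5), (0.0, 3.0)]

best = min(candidates, key=lambda w: rmse(w[0], w[1], xs, ys))

for w0, w1 in candidates:
    print(f"w0={w0}, w1={w1}, RMSE={rmse(w0, w1, xs, ys):.3f}")
print("best line:", best)
```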
34 | I meant in which scenario do you choose MSE and in which one do you use RMSE? | I think it is already answered | Srini Boddu | siliconfish@yahoo.com
35 | i didn't get why pam score is better? | | Anonymous Attendee |
36 | What does the negative sign indicate in the output of the describe method? | | jia sharma | jiavidhi.sharma@gmail.com
37 | What are the x axis and y axis in the hist plot? | The X axis is the value and the Y axis is the frequency of the value | Rajeev | rajeev213149@gmail.com
38 | More towards left - near 34 | | Rajiv | krajiv.2018@gmail.com
39 | What do bins represent? | | Venkata Pradyumna | ivpradyumna@gmail.com
40 | Latitude bimodal and longitude bimodal -> LA and SF? | Yes, more dense population in certain areas. | Domenico Fioravanti | nicodom@gmail.com
41 | right skewed | | Dr. santoshkumar | ksantosh.11@gmail.com
42 | one outlier and skewed | | Sourav Ghosh | souravghosh@hotmail.com
43 | How will the peak help in observation? Is there anything you see in the peak value? | The peak, skew, etc. are ways for us to understand the data better. Having an intuitive sense of what the data is and what its effect will be goes a long way in building a good model. | Rajiv | krajiv.2018@gmail.com
44 | LA = 34.0522° N, 118.2437° W; SF = 37.7749° N, 122.4194° W (yes I confirm) | | Domenico Fioravanti | nicodom@gmail.com
45 | What does capping at 15 mean? | That the income was capped at 15: all values greater than 15 were set to 15. Hope it answers your question. | Aswin Sabaaree | aswin.sabaaree@innovatia.net
46 | What is the reason for the spike? | | Sugandhita | sugandhitap@gmail.com
47 | Does X have to be normally distributed, or Y, or both? | Typically Y is. X usually is not. | Sourav Ghosh | souravghosh@hotmail.com
48 | Capped means that values greater than the capped value are converted to the capped value, e.g. median_house_value > 500000 will be converted to 500000 | | DIvya Pathak | dev.feb88@gmail.com
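The capping described above amounts to taking the minimum of each value and the cap. A minimal sketch with made-up house values (the 500000 cap comes from the message above; the function name and data are illustrative):

```python
CAP = 500000  # cap quoted in the message above

def cap_values(values, cap=CAP):
    """Replace every value above the cap with the cap itself."""
    return [min(v, cap) for v in values]

# Made-up median_house_value entries, for illustration only.
median_house_value = [120000, 480000, 500001, 750000]

capped = cap_values(median_house_value)
print(capped)
```

Every value above the cap collapses onto the same number, which is why a capped attribute shows a spike at the cap in its histogram.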
49 | Why is it easier to perform ML on normal data? | We will learn this in the upcoming chapters | Nini Nursiah | nursiah.neelesh28@gmail.com
50 | How was the ratio 80/20 chosen? Is there a reason? | It depends on the size of the data. We use either 80:20 or 70:30. | Domenico Fioravanti | nicodom@gmail.com
51 | Why are outliers removed if they are good in number? Is it not a problem of data collection? | Outliers are often removed to make modelling easier. Outliers are a real thing in most data collection exercises. | Krishna Mohan | kmiitan96@gmail.com
52 | How do we convert attributes to a bell-curve shape? | All attributes may not be convertible to a bell-shaped curve | Srihari M | srihariblr12@gmail.com
53 | How to decide the percentages taken for the training set vs the test set? Like you took 80% for training and 20% for the test set. What are the various things I need to think about for it? | Typically we choose 20% for the test set. For a large dataset we can choose a smaller percentage for test data. | Rajiv | krajiv.2018@gmail.com
54 | There is a terminology, out-of-time validation. Is it the same thing as the test set? | The out-of-time validation sample contains data from an entirely different time period or customer campaign than what was used for model development. Validating model performance on a different time period is beneficial to further evaluate the model's robustness. | Puneet Rsstogi | puneetrstg@gmail.com
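One way to picture an out-of-time sample is a split by date rather than by random draw. The sketch below uses made-up records and a hypothetical cutoff date; neither comes from the lecture.

```python
from datetime import date

# Made-up (observation_date, value) records, for illustration only.
records = [
    (date(2019, 3, 1), 10.0),
    (date(2019, 7, 15), 12.5),
    (date(2019, 11, 30), 11.0),
    (date(2020, 2, 10), 13.0),
    (date(2020, 5, 20), 14.5),
]

# Hypothetical cutoff: develop the model on data before it,
# then validate on the later period the model has never seen.
cutoff = date(2020, 1, 1)

development = [r for r in records if r[0] < cutoff]
out_of_time = [r for r in records if r[0] >= cutoff]

print(len(development), "development rows,", len(out_of_time), "out-of-time rows")
```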
55 | What does capping at 15 mean? | | Aswin Sabaaree | aswin.sabaaree@innovatia.net
56 | How do you know which data goes to the training set and which data goes to the test set? | live answered; We want the same type of data in training and test. The best way is often to do some random distribution | Stuti Rastogi | e0498211@u.nus.edu
57 | Does capping mean that values greater than the capped value are converted to the capped value, e.g. median_house_value > 500000 will be converted to 500000? | That is correct | DIvya Pathak | dev.feb88@gmail.com
58 | @TAs - When do we use MSE and when do we use RMSE? | Srini, I am not sure if there is a perfect answer. Both of them are a measure of the error. The MSE may penalise the error more severely than the RMSE, which may help sometimes and may be a detriment at other times. | Srini Boddu | siliconfish@yahoo.com
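The difference in how the two measures react to a large error can be seen on a toy example. This is a sketch with made-up predictions; the helper names are illustrative.

```python
import math

def mse(y_true, y_pred):
    """Mean squared error, in squared units of the target."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error, in the same unit as the target."""
    return math.sqrt(mse(y_true, y_pred))

# Made-up targets and two sets of predictions.
y_true = [10.0, 10.0, 10.0, 10.0]
small_errors = [11.0, 9.0, 11.0, 9.0]      # every prediction is off by 1
one_big_error = [10.0, 10.0, 10.0, 14.0]   # a single prediction is off by 4

print("small errors:  MSE =", mse(y_true, small_errors),
      " RMSE =", rmse(y_true, small_errors))
print("one big error: MSE =", mse(y_true, one_big_error),
      " RMSE =", rmse(y_true, one_big_error))
```

Both measures rank the two prediction sets in the same order, since the square root is monotonic; what changes is the scale and unit in which the error is reported.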
59 | Why is training not performed on 100% of data? | Because you need to test the quality of the model you create. If you used the test data for training, the model would usually show a good result on it; If you train on 100% of the data, you will have no data left to evaluate how your model is performing | Nikhil Sharma | nikhilthemacho@gmail.com
60 | As humans we can be biased, but how come an ML algorithm is biased? | If an ML algorithm learns from biased data then of course the final model will be biased :) | Vikas Bhartiya | ghivikas@gmail.com
61 | What are the different mechanisms to split data? Is hash the most famous? | We will cover different techniques now | Rajiv | krajiv.2018@gmail.com
62 | time based observations | | Sourav Ghosh | souravghosh@hotmail.com
63 | what is biased? | | VED | parmarvedpro5@gmail.com
64 | The training and test data might have the same row if we use the random permutation? | No, the training data and test data are disjoint. | Nini Nursiah | nursiah.neelesh28@gmail.com
65 | but can't the splitting be done initially and then models be run with the same training and test set? | | Preedesh M | Preedesh@gmail.com
66 | Why is 42 passed to random.seed? | It is just a random number, you can choose a different one; You can use another number. The results may vary a little; We can pass any number as the random seed, but make sure to use that number throughout your model training process | Anantpadmanabh Divanji | apgd14@gmail.com
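The role of the seed can be sketched with the standard library alone. The helper below is a hypothetical stand-in for the split function used in the session (its name and signature are assumptions): it shuffles the row indices with a seeded generator and slices off the test portion, so the same seed reproduces the same split and the two sets never share a row.

```python
import random

def split_train_test(rows, test_ratio, seed):
    """Shuffle row indices with a seeded generator, then slice off the test set."""
    rng = random.Random(seed)          # the seed makes the shuffle repeatable
    indices = list(range(len(rows)))
    rng.shuffle(indices)
    test_size = int(len(rows) * test_ratio)
    test_idx = indices[:test_size]
    train_idx = indices[test_size:]
    return [rows[i] for i in train_idx], [rows[i] for i in test_idx]

rows = list(range(100))  # made-up data: 100 row identifiers

train_a, test_a = split_train_test(rows, 0.2, seed=42)
train_b, test_b = split_train_test(rows, 0.2, seed=42)

print("same seed, same test set:", test_a == test_b)
print("train and test disjoint:", set(train_a).isdisjoint(test_a))
```

Any integer works as the seed; what matters is reusing the same one whenever the split has to be reproduced.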
67 | Can the seed be any value, or do we need to base it on some observation? | It can be any value | Satya Sunil | kvv.satyasunil@gmail.com
68 | ok thanks Praveen | | Anmol Khopade | anmolck@gmail.com
69 | what exactly is Hash? | A hash is a function that converts one value to another. Hashing data is a common practice in computer science and is used for several different purposes. Examples include cryptography, compression, checksum generation, and data indexing | Puneet Rsstogi | puneetrstg@gmail.com
70 | What are the relative advantages of using md5 vis-a-vis the other one, please? | Typically any hash that randomises sufficiently will do the work | Sourav Ghosh | souravghosh@hotmail.com
71 | What is hash? Please elaborate | https://en.wikipedia.org/wiki/Hash_function; Hashes are used to create a random representation of the data. This removes any systematic grouping, e.g. all data from the same area being together. | Sugandhita | sugandhitap@gmail.com
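A hash-based split can be sketched as follows. This is an illustration under assumptions, not the code shown in the session: each row is assumed to have a stable identifier, the identifier is hashed with MD5 (the hash named in the questions above), and the row goes to the test set when the last byte of the digest falls below 256 * test_ratio. The function name is hypothetical.

```python
import hashlib

def in_test_set(identifier, test_ratio):
    """Decide test membership from the hash of a stable row identifier."""
    digest = hashlib.md5(str(identifier).encode("utf-8")).digest()
    # The last byte is roughly uniform in 0..255, so about test_ratio
    # of all identifiers fall below the threshold.
    return digest[-1] < 256 * test_ratio

# Made-up identifiers: 1000 rows today, 500 more appended later.
ids_today = list(range(1000))
ids_later = list(range(1500))

test_today = [i for i in ids_today if in_test_set(i, 0.2)]
test_later = [i for i in ids_later if in_test_set(i, 0.2)]

# Rows that were in the test set before new data arrived stay there.
print("stable:", test_today == [i for i in test_later if i < 1000])
print("test share today:", len(test_today) / len(ids_today))
```

Because membership depends only on the identifier, appending new rows never moves an existing row between the training and test sets, which a fresh random shuffle would not guarantee.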
72 | Why can't we just select the last 20% of the dataset for the test set? | The last 20% may come from the same area and may not be representative of the entire data set; Then the data used for training the model will not be inclusive | Nini Nursiah | nursiah.neelesh28@gmail.com
73 | why do we need to append data at the end? | | Aakash Sinha | post2aakash@gmail.com
74 | Can more comments be added to the Python file to explain the code sections? The reason is that there are code sections which are alternative ways to achieve a similar goal. | noted the feedback | Prakhar Prasad | prakhar.prasad@gmail.com
75 | Do we need to create an Id column if we use train_test_split()? | If we are using the scikit-learn function then no; it will split based on the row index, I believe | Nitin Nigam | nknigam@gmail.com
76 | What is the best way of getting the train and test data: creating our own function or using the scikit-learn function? | using scikit-learn function; You can do both. Using the scikit-learn function is easier and usually better | DIvya Pathak | dev.feb88@gmail.com
77 | Can we see an example of sampling bias? | Hi Sourav, is your question answered now? Prof just explained it again for Jia's question | Sourav Ghosh | souravghosh@hotmail.com
78 | sampling bias arises due to the way we collect the data, right? | yes | Nini Nursiah | nursiah.neelesh28@gmail.com
79 | Like in CNNs, can't we take the entire data and split it into an 80-20 ratio where 80% is training and 20% validation, and refine based on testing accuracy? Random just seems too uncontrollable even with a larger dataset | | Anmol Khopade | anmolck@gmail.com
80 | How do you take a sample of 100 out of a 200 million adult population? But I do realize that they take the samples from the whole population set? What is the process to incorporate the whole population? | It's not usually possible to incorporate the entire population. So you try to sample a smaller set for your exercise. | Srini Boddu | siliconfish@yahoo.com
81 | Does this sampling bias exist when we acquire the data, or does it also exist while splitting test and train data? | We are going to show this | Puneet Rsstogi | puneetrstg@gmail.com
82 | How are stratified samples different from cluster samples? | | Krishna Mohan | kmiitan96@gmail.com
83 | What is the sampling bias in the example shown? Please explain | | jia sharma | jiavidhi.sharma@gmail.com
84 | I got my answer - Thanks | | Srini Boddu | siliconfish@yahoo.com
85 | How do we measure if a status is large enough? | | Domenico Fioravanti | nicodom@gmail.com
86 | If we know the data in advance, we can categorise the income | | Veeru(VeeraNancharaiah Javvaji) | jsriveeru@gmail.com
87 | for future data, how can ? | For the near future we can assume that the income will stay in a similar range. | Veeru(VeeraNancharaiah Javvaji) | jsriveeru@gmail.com
88 | *stratum | | Domenico Fioravanti | nicodom@gmail.com
89 | How did the prof decide on income as the measure for stratified sampling? Or did he pick it randomly? | | Srini Boddu | siliconfish@yahoo.com
90 | Wouldn't it be better to use bins=[0., 2, 3.0, 4.5, 6., np.inf] to increase the size of the first stratum? | | Domenico Fioravanti | nicodom@gmail.com
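The effect of moving the first bin edge can be checked by counting how many values land in each stratum. The sketch below is an illustration under assumptions: the income values are made up, the "lecture" edges of 1.5-wide bins are inferred from the discussion rather than quoted from the code, and intervals are treated as open on the left and closed on the right.

```python
import bisect
import math
from collections import Counter

def income_category(median_income, edges):
    """Return the 1-based stratum index, with intervals (edge[i-1], edge[i]]."""
    return bisect.bisect_left(edges, median_income)

# Assumed lecture edges (1.5-wide bins) versus the edges proposed above.
lecture_edges = [0.0, 1.5, 3.0, 4.5, 6.0, math.inf]
proposed_edges = [0.0, 2.0, 3.0, 4.5, 6.0, math.inf]

# Made-up median incomes, for illustration only.
incomes = [0.9, 1.4, 1.8, 1.9, 2.5, 3.2, 3.8, 4.1, 5.0, 7.5]

lecture_counts = Counter(income_category(x, lecture_edges) for x in incomes)
proposed_counts = Counter(income_category(x, proposed_edges) for x in incomes)

print("lecture edges: ", sorted(lecture_counts.items()))
print("proposed edges:", sorted(proposed_counts.items()))
```

On this toy data, widening the first bin grows the first stratum at the expense of the second, so the trade-off is between the sizes of neighbouring strata rather than a pure gain.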
91 | 2 to 6 has most | | AG | abhijeetgadgil@gmail.com
92 | so how to represent 2 to 6 in the right manner | Hi AG, I did not get your question. Can you please elaborate? | AG | abhijeetgadgil@gmail.com
93 | If I look at the bins, we say 0 to 1.5 and then 1.5 to 3, but why are the bins picked in ranges of 1.5 in this example? | There are a couple of ways to determine the bin size. First, find the smallest and largest data points, lower the minimum a little and raise the maximum a little, decide how many bins you need, divide the range by the number of bins to get the bin width, and finally create the bin boundaries. Second, you can use Sturges' rule, K = 1 + 3.322 log10(N), where K is the number of class intervals and N is the number of observations. | AG | abhijeetgadgil@gmail.com
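Both approaches from the answer above fit in a few lines. This is a minimal sketch: the function names are illustrative, the result of Sturges' rule is rounded up to a whole number of bins (an assumption, since the answer does not say how to round), and the equal-width edges are built from the raw minimum and maximum without the small padding the answer mentions.

```python
import math

def sturges_bins(n_observations):
    """Sturges' rule: K = 1 + 3.322 * log10(N), rounded up to a whole bin count."""
    return math.ceil(1 + 3.322 * math.log10(n_observations))

def bin_edges(values, n_bins):
    """Equal-width bin boundaries: bin width = range / number of bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]

print("bins for 100 observations: ", sturges_bins(100))
print("bins for 1000 observations:", sturges_bins(1000))
print("edges:", bin_edges([0.0, 3.5, 7.2, 10.0], 5))
```

The rule grows only logarithmically with N, so even large datasets get a modest number of bins.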
94 | Why was numerical converted to categorical? I didn't understand that part | Categorical was converted to numerical, to enable modelling | Nini Nursiah | nursiah.neelesh28@gmail.com