Why Is Probability Important in Data Science and Machine Learning?

Among the many fields and branches of Mathematics, probability plays an important role in both Artificial Intelligence and Data Science.

When implementing machine learning algorithms, you may have come across situations where the environment your algorithm operates in is non-deterministic, i.e., the same input does not always produce the same output. The real world is full of such scenarios, where behavior varies even though the input stays the same. Because machine learning involves very large amounts of data, many hyperparameters, and a complex environment, uncertainty is bound to arise. It can take the form of missing variables, incomplete modeling, or data that is inherently probabilistic.

Probability can be defined as the likelihood that something occurs. Whenever we need to express the chance of some outcome or event occurring, we talk in terms of probability.

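When all outcomes of an experiment are equally likely, the probability of an event is calculated as follows:

P(event) = number of favorable outcomes / total number of possible outcomes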

Probability represents a certainty factor: the degree of certainty you assign to an event happening. Say you are rolling a die and you say that the certainty with which a 6 shows up is ⅙. That means there is a 16.67% chance that a 6 shows up. That is the certainty you assign to that particular event. This is known as probability or, more precisely in our case, frequentist probability.
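The ⅙ figure can be checked in two ways: by counting outcomes and by simulating many rolls. The sketch below is illustrative only; it assumes a fair six-sided die, and the function names `classical_probability` and `estimate_by_simulation`, the seed, and the number of rolls are choices made for this example, not part of the original discussion.

```python
import random
from fractions import Fraction


def classical_probability(favorable, total):
    """Favorable outcomes divided by total equally likely outcomes."""
    return Fraction(favorable, total)


def estimate_by_simulation(target, rolls, seed=0):
    """Share of simulated rolls of a fair six-sided die that show `target`."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(1 for _ in range(rolls) if rng.randint(1, 6) == target)
    return hits / rolls


# Exactly one face out of six is a 6.
exact = classical_probability(1, 6)

# Frequentist view: the relative frequency over many rolls approaches 1/6.
estimate = estimate_by_simulation(target=6, rolls=100_000)

print(f"exact:     {exact} = {float(exact):.4f}")
print(f"simulated: {estimate:.4f}")
```

The simulated value will not equal ⅙ exactly, but it should move closer to it as the number of rolls grows, which is the frequentist reading of the probability.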

Frequentist probability denotes the frequency with which an event happens over many trials. Rolling a die is frequentist because ⅙ means that, over a very large number of rolls, about one-sixth of them will show a 6. Not all scenarios are frequency related, however. If we consider a machine learning problem in which we estimate the probability that the price of fuel rises or falls, we would not think of it in terms of repetition, as in the frequentist scenario. Instead, we say that this event could occur with a certain degree of belief. This interpretation is known as Bayesian probability.

Conditional probability is defined as the likelihood of an event or outcome occurring, given that a previous event or outcome has occurred. It is calculated by dividing the probability that both events occur by the probability of the preceding event.

In other words, conditional probability is the chance that some outcome occurs given that another event has also occurred. It is often stated as the probability of B given A and is written as P(B|A), where the probability of B depends on A having happened.

The conditional probability formula is given as:

P(B|A) = P(A ∩ B) / P(A), provided P(A) > 0

In a card game, suppose a player needs to draw two cards of the same suit in order to win. Of the 52 cards, there are 13 cards in each suit. Suppose the player first draws a heart and now wishes to draw a second heart. Since one heart has already been drawn, there are 12 hearts remaining in a deck of 51 cards. So the conditional probability P(Draw second heart|First card a heart) = 12/51, or about 23.5%.
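The 12/51 result can also be reached from the definition P(B|A) = P(A ∩ B) / P(A) by enumerating every ordered two-card draw. This is a minimal sketch, assuming a standard 52-card deck drawn without replacement; the variable names are made up for the example.

```python
from fractions import Fraction
from itertools import permutations

SUITS = ["hearts", "diamonds", "clubs", "spades"]
# 13 ranks per suit, 52 cards in total.
deck = [(rank, suit) for suit in SUITS for rank in range(1, 14)]

# Every ordered draw of two different cards: 52 * 51 possibilities.
draws = list(permutations(deck, 2))

# Event A: the first card is a heart.
first_heart = [d for d in draws if d[0][1] == "hearts"]
# Event A and B: both cards are hearts.
both_hearts = [d for d in first_heart if d[1][1] == "hearts"]

p_first = Fraction(len(first_heart), len(draws))
p_both = Fraction(len(both_hearts), len(draws))

# P(second heart | first heart) = P(both hearts) / P(first heart)
p_second_given_first = p_both / p_first
print(p_second_given_first)  # prints 4/17, the reduced form of 12/51
```

Using `Fraction` keeps the arithmetic exact, so the result can be compared directly with 12/51 instead of with a rounded decimal.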

Another way to compute conditional probability is Bayes’ Theorem

Bayes’ Theorem describes a method for finding a conditional probability. It is named after the 18th-century British mathematician Thomas Bayes, who discovered it. As noted above, conditional probability is the probability of an event occurring given that one or more other events have occurred. The theorem is widely used in machine learning for modeling hypotheses, classification, and optimization.

For two events A and B, with P(B) > 0, Bayes’ Theorem states:

P(A|B) = P(B|A) × P(A) / P(B)
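As a rough illustration, Bayes’ Theorem, P(A|B) = P(B|A) × P(A) / P(B), can be applied to a toy spam-filtering question: how likely is a message to be spam given that it contains a particular word? Everything in this sketch is an assumption made for the example: the function name `bayes_posterior`, the scenario, and the three input probabilities are hypothetical and do not come from real data. P(B) is expanded with the law of total probability as P(B|A) × P(A) + P(B|not A) × P(not A).

```python
from fractions import Fraction


def bayes_posterior(prior, likelihood, likelihood_given_not):
    """Return P(A|B) from P(A), P(B|A) and P(B|not A)."""
    # Law of total probability: P(B) = P(B|A)P(A) + P(B|not A)P(not A)
    evidence = likelihood * prior + likelihood_given_not * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
p_spam = Fraction(1, 5)              # P(A): a message is spam
p_word_given_spam = Fraction(1, 2)   # P(B|A): the word appears in spam
p_word_given_ham = Fraction(1, 20)   # P(B|not A): the word appears in non-spam

posterior = bayes_posterior(p_spam, p_word_given_spam, p_word_given_ham)
print(posterior, float(posterior))  # 5/7, roughly 0.714
```

With these made-up inputs, seeing the word raises the probability of spam from the prior of 1/5 to a posterior of 5/7, which is the kind of belief update the theorem formalizes.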

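Examples of Bayes’ Theorem used in practice in machine learning and data science include the Naive Bayes classifier for classification and Bayesian optimization for tuning hyperparameters.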

It would be fair to say that probability is required to work effectively through a machine learning predictive modeling project. Machine learning is about developing predictive models from uncertain data, and uncertainty means working with imperfect or incomplete information. Uncertainty is fundamental to the field of machine learning, yet it is one of the aspects that causes the most difficulty for beginners, especially those coming from a developer background. There are three main sources of uncertainty in machine learning: noisy data, incomplete coverage of the problem domain, and imperfect models.
