Today there’s an abundance of textbooks and webbooks on Bayesian probability theory, decision theory, and statistics, at very diverse technical levels. I wanted to point out three books whose main topic is not probability theory, but which give very good introductions (even superior to those of some specialized textbooks, in my opinion) to Bayesian probability theory:

  • Artificial Intelligence: A Modern Approach by S. J. Russell, P. Norvig. Part IV is an amazing introduction to Bayesian theory – including decision theory – with many connections with Artificial Intelligence and Logic.

  • Medical Decision Making by H. C. Sox, M. C. Higgins, D. K. Owens. This is essentially a very clear and insightful textbook on Bayesian probability theory and decision theory, but targeted to clinical decision-making.

  • Sentential Probability Logic: Origins, Development, Current Status, and Technical Applications by T. Hailperin. This is a book on Bayesian probability theory, presented as a generalization of propositional logic. This point of view is the most powerful I know of. The book also has important results on methods for finding probability bounds (see the small sketch below the list) and on combining evidence.
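
A tiny taste of the probability-bounds idea from Hailperin's book: the propositional structure alone already constrains the probability of a compound statement. For a conjunction these are the classical Fréchet bounds; here is a minimal sketch (the input probabilities are made-up placeholders, not from the book):

```python
# Fréchet bounds on P(A and B), given only P(A) and P(B).
# They follow from propositional logic plus the probability axioms alone.
def conjunction_bounds(p_a: float, p_b: float) -> tuple[float, float]:
    lower = max(0.0, p_a + p_b - 1.0)  # the events must overlap at least this much
    upper = min(p_a, p_b)              # and can overlap at most this much
    return lower, upper

print(conjunction_bounds(0.7, 0.6))  # -> (0.3, 0.6): P(A and B) lies in this interval
```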

  • Salamander@mander.xyz

    Thank you for these suggestions!

    I’ve recently been thinking that it would be worth investing the effort to learn the underlying mathematical principles and logic of artificial intelligence. I do have a superficial understanding and I’m not terrible with math, so Artificial Intelligence: A Modern Approach might be what I need.

    • stravanasuOPM

      Cheers, I’m happy if they’re helpful!

      The first Parts of Artificial Intelligence are very thorough: they explain the notions and ideas, give examples, explain why we should use those ideas instead of others (mentioning criticisms and so on), and give a taste of the history leading to them, with references. Very scholarly and satisfying.

      Surprisingly I found Part V of the book, on machine learning, very poor and even in contradiction with the earlier parts. It’s mainly a list of boldface terms (“just memorize this!”) with hand-waving explanations and outdated literature.

      The book also shows that some current practices in machine learning are misguided or even wrong. One example is the neglect of the utilities involved in classification problems when training or evaluating classifiers (if I may do some self-promotion, we discussed these issues here).

      • Salamander@mander.xyz

        Thanks!

        The book also shows that some current practices in machine learning are misguided or even wrong. One example is the neglect of the utilities involved in classification problems when training or evaluating classifiers (if I may do some self-promotion, we discussed these issues here).

        This is great, because understanding how current practices fail is good evidence of having a strong understanding of the underlying principles. As I mentioned, my knowledge in this field is still very superficial, so please don’t expect me to be able to make an intelligent comment soon, but I do appreciate the self-promotion, as it is cool to see that we have someone with deep knowledge in this field around here.

        Earlier this week my girlfriend approached me with the problem of writing a Python program to count the number of fluorescently labeled cells in some images she took with a fluorescence microscope. It turns out that this is a problem that has been solved many times by many labs, but every lab’s samples and images are a bit different, often different enough that each lab has to write its own program to run its analysis.

        What I have found is that AI solutions are being developed (the most popular for cell counting seems to be a U-Net neural network) so that researchers will no longer have to write their own programs, but it looks like they will still need to learn how to prepare the correct training and validation sets so that they can actually train their model to work with their specific kind of data.

        For the cell counting problem, the classification I am concerned with is “these pixels belong to unique cell #N”, and I suspect that this type of classification problem is a lot simpler than the more general kinds of classification that AIs deal with. So perhaps the pitfalls of classifiers that you explore in your paper do not affect this limited type of model. But maybe they do; once I learn more about the topic I will find out :-)
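
        To make that “these pixels belong to unique cell #N” idea concrete, the simplest non-AI baseline I can picture is thresholding plus connected-component labeling, something like this sketch (the synthetic blobs and the threshold are made up purely for illustration, not our actual pipeline):

        ```python
        import numpy as np
        from scipy import ndimage

        # Tiny synthetic "image" with three bright square blobs (placeholder data).
        image = np.zeros((64, 64))
        image[5:10, 5:10] = 1.0
        image[20:28, 30:38] = 1.0
        image[50:55, 50:60] = 1.0

        # Naive foreground mask; a real threshold would need tuning per dataset.
        mask = image > 0.5

        # Connected-component labeling: each connected blob of foreground pixels
        # gets its own integer ID, i.e. "these pixels belong to cell #N".
        labels, num_cells = ndimage.label(mask)
        print(num_cells)  # -> 3
        ```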

        • stravanasuOPM

          I believe that “superficial knowledge” is the best starting point :) It gives you more opportunities to understand things with a fresh look, and see things that people with “less superficial” knowledge don’t see anymore. Because unfortunately there’s always a process of routine-building and fossilization going on in parallel with knowledge acquisition.

          The fluorescence problem sounds cool! The point is that every classification involves some uncertainty, which in some cases can’t be reduced below a certain amount, no matter how much training data one uses. Extreme example: even if we give quadrillions of training data about an ordinary tossed coin to some algorithm, the algorithm will never be able to get more than 50% right at the next toss. So, training is good, but it only addresses part of the problem, and only up to a certain point.
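
          A toy sketch of what I mean, in case you want to see it in code (the “model” here just predicts the majority outcome seen in training; everything is made up for illustration):

          ```python
          import random

          random.seed(0)

          def train_and_test(n_train: int, n_test: int = 100_000) -> float:
              """'Train' a majority-vote predictor on fair-coin flips and return its test accuracy."""
              train = [random.randint(0, 1) for _ in range(n_train)]
              prediction = 1 if 2 * sum(train) >= n_train else 0  # predict the majority side
              test = [random.randint(0, 1) for _ in range(n_test)]
              return sum(t == prediction for t in test) / n_test

          for n in (100, 10_000, 1_000_000):
              print(n, round(train_and_test(n), 3))  # stays near 0.5 however large n gets
          ```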

          What becomes important next is how good or bad it is to make the wrong decision or classification, owing to that uncertainty. In some situations it’s best to avoid predicting something that’s not the case (say, a pixel that’s not a target cell); in others it’s best to be on the safe side (better to have some false pixels than to lose even just one target cell). What’s best is very problem-dependent, even when the data are the same: it depends on what further use the classification will have. This aspect is, at present, something that training processes don’t take into account very much.
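
          A small sketch of what I mean by bringing the utilities in (the probability and the utility numbers are invented just for the example): the same classifier output leads to different decisions once we state how costly each kind of mistake is.

          ```python
          # Decide "cell" vs "background" for a pixel by maximizing expected utility.
          def decide(p_cell: float, u: dict) -> str:
              """u[(decision, truth)] is the utility of that outcome."""
              eu_cell = p_cell * u[("cell", "cell")] + (1 - p_cell) * u[("cell", "background")]
              eu_bg = p_cell * u[("background", "cell")] + (1 - p_cell) * u[("background", "background")]
              return "cell" if eu_cell >= eu_bg else "background"

          # Here missing a real cell is the very costly mistake...
          cautious = {("cell", "cell"): 1, ("cell", "background"): -1,
                      ("background", "cell"): -20, ("background", "background"): 1}
          # ...while here a falsely flagged cell pixel is the costly one.
          strict = {("cell", "cell"): 1, ("cell", "background"): -20,
                    ("background", "cell"): -1, ("background", "background"): 1}

          # Same classifier probability, different decisions:
          print(decide(0.3, cautious), decide(0.3, strict))  # -> cell background
          ```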

          Artificial Intelligence gives a very insightful perspective and a systematic framework for the whole problem :)

          Well, sorry for the babble – good luck with the research!

          • Salamander@mander.xyz

            Extreme example: even if we give quadrillions of training data about an ordinary tossed coin to some algorithm, the algorithm will never be able to get more than 50% right at the next toss.

            This makes me think about stock traders who are trying to build AI models to optimize their trading strategy. I can’t say that they are wrong, and sure, why not, I encourage them to try… But I think they are dealing with a very similar problem to a coin toss.

            Well, sorry for the babble – good luck with the research!

            Oh, don’t apologize for that!! And thank you :D