Dan Rose AI | Applied AI Blog, November 26, 2024
Don’t be data-driven in AI

 

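This article looks at the problems that over-reliance on data-driven decisions can cause in AI, and argues for curiosity as the main driver of AI projects. The author holds that leaning too heavily on existing data biases decisions and hides potential blind spots, while curiosity leads us to explore the unknown and find more valuable information. The article uses examples to show how data can mislead, and stresses the positive effects of curiosity, such as sparking passion and encouraging innovation. In the end, the author calls on AI practitioners to make curiosity their core driver and to solve problems more proactively.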

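🤔 **The limits of data-driven decisions:** The article points out that over-reliance on data-driven decisions can bias them, because we usually cannot get all the relevant data, and the data we do get may contain errors or mislead. For example, the statistician Ronald Fisher wrongly maintained that smoking does not cause lung cancer, a result of leaning too heavily on the data analysis of his day.

🧐 **Curiosity is the more valuable driver:** The author argues that AI projects are about solving problems or optimizing processes, and the solutions do not necessarily exist in the data at hand. So we should be guided by curiosity, explore the unknown, find the blind spots in the data, and dig into them.

💡 **The positive effects of curiosity:** Curiosity sparks excitement about exploring and gives AI projects energy. When we explore a problem with curiosity, it is easier to get through the challenges and tedious parts of a project and to succeed in the end.

🚀 **Proactive exploration rather than reaction:** The article stresses that being data-driven is reactive by nature, while being curiosity-driven is proactive. Proactive exploration is more effective for solving problems and driving innovation; it requires us to face the unknown and let curiosity guide us.

🤔 **Data can be manipulated:** The results of data analysis can be shaped by subjective factors, or even manipulated, intentionally or not, leading to wrong conclusions. So we should not trust data blindly, but keep thinking independently and critically.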

The term “data-driven” usually carries positive connotations, but when I hear it used I get a little anxious about the “data-driven” decisions that might be about to happen. Let me explain why.

According to Wikipedia, “The adjective data-driven means that progress in an activity is compelled by data, rather than by intuition or by personal experience.” In other words: look at the data as the primary source of information to act on. When the data gives you a reason to act, you act. At a glance that might seem like a very sound way to work, especially in the AI domain, which relies on data in so many ways. But in fact being data-driven can be very problematic when working with AI. I actually think people who say they are data-driven are, in general, on the wrong track. This does not mean that I’m against putting a lot of effort into understanding your data. I’m actually a big believer that collecting, understanding and preparing data should be the activities that get the most resources in AI projects. So I’m pro good data science but against being driven by data, and I see those as two very different things.

But then why is it so problematic to be data-driven?

My primary argument is that the driver behind decision-making and activities should not be the data you have, but rather curiosity about the problem and the world around it. In a sense that means being driven by the data you don’t have. The final goal of an AI project is often to solve a problem or improve a process, and the solution does not always exist in the data you have generated or in the data being generated by the world’s current solutions. So instead you should be curiosity-driven, or at least problem-driven. This means that you should not approach problems by looking at your data and drawing a conclusion. You should look at your data, look for the blind spots, and be curious from there. What is it that you don’t know? I’ll get back to curiosity later. First I have some more arguments against being data-driven.

You will extremely rarely have all the data relevant to a problem, even after exhausting every potential data source. So when you draw conclusions from the data you have, the conclusions will always be at least a bit off. This doesn’t mean that the data or the conclusions are not useful, but you will always be at least a bit wrong. As statisticians would say: “all models are wrong but some are useful”.

Another problem with being data-driven is the narrative that decisions made on data are better than decisions made on gut feeling. While that might be true sometimes, data is not one-size-fits-all: it can be very helpful at times and very misleading at others.

An example is Ronald Fisher, the father of modern statistics, who in hindsight was also a little too data-driven. He stubbornly held to his conclusion that the data showed lung cancer was not a result of smoking. The relationship, he said, must run the other way around: people with lung cancer, or at higher risk of it, were simply more likely to be smokers. He argued that it was either a genetic relation or that cancer patients would use smoking to soothe pain in their lungs. So even the best statisticians can be told stories by data that are far from the truth.
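To make this concrete, here is a minimal toy simulation, not a reconstruction of any real study. It assumes two hypothetical worlds: one where smoking itself raises cancer risk, and one where a hidden gene raises both the chance of smoking and the cancer risk. Every rate, along with the names `simulate` and `risk_ratio`, is made up for illustration.

```python
import random

def simulate(world, n=20000, seed=1):
    """Return (smokes, cancer) pairs from a toy world. All rates are invented."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        if world == "causal":
            # Smoking itself raises the cancer risk.
            smokes = rng.random() < 0.4
            cancer = rng.random() < (0.15 if smokes else 0.03)
        else:
            # A hidden gene raises both the chance of smoking and the
            # cancer risk; smoking has no effect of its own here.
            gene = rng.random() < 0.3
            smokes = rng.random() < (0.8 if gene else 0.2)
            cancer = rng.random() < (0.20 if gene else 0.02)
        rows.append((smokes, cancer))
    return rows

def risk_ratio(rows):
    """Cancer rate among smokers divided by the rate among non-smokers."""
    smokers = [cancer for smokes, cancer in rows if smokes]
    non_smokers = [cancer for smokes, cancer in rows if not smokes]
    return (sum(smokers) / len(smokers)) / (sum(non_smokers) / len(non_smokers))

for world in ("causal", "confounded"):
    print(world, round(risk_ratio(simulate(world)), 2))
```

In both worlds smokers show a clearly higher cancer rate than non-smokers. The observed counts alone cannot tell you which world you are in; answering that takes curiosity about the data you don’t have.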

The last problem with data is its ability to tell you the story you want it to tell. That can happen consciously or unconsciously. A famous quote attributed to the economist Ronald Coase goes “If you torture the data long enough, it will confess to anything”, so there is no certainty that the conclusion you get from data is correct. The interpretation can be very biased, and sometimes we torture the data without even being aware of it ourselves.
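One way to see the “torture” effect is a small sketch of unintentional data dredging. It assumes a purely random outcome and 200 purely random candidate features (both numbers, and the helper names `pearson` and `strongest_spurious_correlation`, are hypothetical choices for illustration), and then reports the strongest correlation found.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def strongest_spurious_correlation(n_samples=30, n_features=200, seed=0):
    """Largest |correlation| between pure noise and many pure-noise features."""
    rng = random.Random(seed)
    outcome = [rng.gauss(0, 1) for _ in range(n_samples)]
    best = 0.0
    for _ in range(n_features):
        feature = [rng.gauss(0, 1) for _ in range(n_samples)]
        best = max(best, abs(pearson(feature, outcome)))
    return best

best_r = strongest_spurious_correlation()
print(f"strongest correlation among 200 noise features: {best_r:.2f}")
```

Nothing here is related to anything else, yet if you try enough features one of them will look like a real signal. Trying many cuts of the data and reporting only the best one is the same thing, whether or not you mean to do it.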

About curiosity

So, as promised, I’m getting back to being curious. If I had to choose one keyword for succeeding with AI, it would be curiosity. AI projects usually start with a process to optimize or a problem to solve, and before training a model on data you have to be curious about the problem. That way the data comes after the problem, and as a result it will be more relevant and more specific to the problem.

Curiosity to me means exploring with as few preconceptions as possible. The best example for me is when children lift up rocks on the ground just to see what is underneath. If you have ever seen a child do that, you will have seen that there are no expectations, only excitement both before and after the rock has been lifted. And that is exactly what curiosity does to the practitioner. It leads to excitement, which in turn leads to passion. Passion makes everything much easier, and even the tedious parts of a project will feel effortless.

AI is also explorative by nature, and that’s why curiosity suits it so well. If there are specific expectations in an explorative process, disappointment is almost a given.

As a result, you must let curiosity be the primary driver behind the decisions you make and the activities you take on. Being data-driven is reactive in nature, and if you want to be innovative in solving problems you must be proactive. Being proactive requires you to be curious about your blind spots and to be driven by the unknown.
