Communications of the ACM - Artificial Intelligence, January 9
We’re Reaping Just What We Sowed

This article examines the gap between theoretical computer science and practical computer engineering. Theoretical models often fail to capture the full complexity of real systems, producing a mismatch between theory and practice. Using sorting algorithms and network protocols as examples, it shows that a theoretically optimal algorithm may not be the best choice in practice, and that TCP, although unreliable in theory, is treated as "reliable" in practice. The article also discusses how the rise of generative AI deepens this mismatch, and calls for practical computer engineering to be distinguished from formal computer science, stressing the importance of understanding real-world complexity.

🧮 The abstract models of theoretical computer science may not capture the full complexity of real systems, creating a gap between theory and practice. For example, the theoretical complexity analysis of sorting diverges from the practical efficiency of Quicksort.

🌐 In computer networking, TCP is widely described as "reliable" even though the underlying network protocols are not. This "reliability" is a pragmatic definition meant to simplify application development, but it can obscure the remaining risk of errors and data corruption.

🤖 The emergence of generative AI deepens the mismatch between theory and practice: because it relies on statistical and randomized methods, full reliability cannot be guaranteed. Despite its errors, generative AI is considered "good enough" for some applications, much as we have come to call TCP "reliable."

🎓 Computer science education should distinguish formal computer science from practical computer engineering. The former helps us understand and analyze the latter, but may never fully predict its behavior. We should acknowledge and teach the complexity of practice rather than pursue theoretical simplicity alone.

Theoretical Computer Science is a glass slipper that has never quite fit the foot of practical Computer Engineering. The essence of their marriage is that theoreticians devise a formal model which captures some essential features of reality, and then apply reasoning about it to implementations. The problems occur when the models do not capture enough of the structure of real systems to explain and predict their behavior.

Some of the earliest issues in modeling computation arise in dealing with finiteness. The problem with finite computational models is that they are all equivalent in expressive power and also quite limited. This creates two issues: the inability to solve irregular problems on inputs of unbounded size (see the Pumping Lemma), and the possibility of highly specific effects that arise due to the precise numerical bound on size (as is found in finite group theory).

The idea of computing using an abstract machine model (for example, push-down automata and Turing machines) that can grow during the execution of an algorithm leads to a theory of computation that is quite rich. The implications of such models can apply to real-world computers, as long as resource utilization does not exceed their physical limitations. Even when those bounds are reached, there is still the question of what could in future be computed on machines of ever-greater size and speed. However, when even futuristic physical limitations and issues like power consumption are addressed, the correspondence between the infinitary models and reality starts to fray.

A widely understood example of this divergence can be found in the application of the theory of algorithmic complexity to sorting. The classical analysis of sorting yields the well-known result that, in the worst case, the number of comparison or data movement operations required to sort a list of N integers is bounded by some constant multiple of N*log(N). However, the most commonly used algorithm, Quicksort, requires quadratic resources in the worst case (meaning some multiple of N²). A multiple of N² is greater than any multiple of N*log(N) for large-enough N, so in the unbounded theory of complexity, Quicksort is considered "less efficient" than any number of algorithms, such as Merge Sort or Heapsort, that can be implemented using at most a multiple of N*log(N) resources.
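To make the gap concrete, here is a minimal Python sketch (my illustration, not the author's example) that counts partition comparisons for a naive Quicksort that always pivots on the first element. Already-sorted input triggers the quadratic worst case, while shuffled input stays close to N*log(N).

```python
import random

def quicksort_comparisons(items):
    """Count comparisons made by a naive Quicksort that always
    pivots on the first element of each sublist."""
    comparisons = 0
    stack = [items]  # explicit stack avoids deep recursion in the unbalanced worst case
    while stack:
        sub = stack.pop()
        if len(sub) <= 1:
            continue
        pivot, smaller, larger = sub[0], [], []
        for x in sub[1:]:
            comparisons += 1
            (smaller if x < pivot else larger).append(x)
        stack.append(smaller)
        stack.append(larger)
    return comparisons

n = 2000
print("already sorted:", quicksort_comparisons(list(range(n))), "comparisons")      # roughly n*n/2
print("shuffled:      ", quicksort_comparisons(random.sample(range(n), n)), "comparisons")  # roughly n*log2(n)
```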

There are many reasons for this mismatch between theory and practice, including:

- The pattern of memory and register usage in Quicksort means that it is extremely efficient on moderately sized datasets (locality).
- In many realistic applications, it is unlikely that the worst-case scenario of maximum resource utilization for Quicksort will occur, and in some cases Quicksort is able to terminate very quickly (a characteristic known as "early stopping").
- In practice, many sorting utilities evaluate the nature of their input and adaptively use a number of different algorithms as alternatives or in concert to try to obtain the best result. This includes the use of Bubble Sort, an algorithm which has terrible worst-case efficiency but is well adapted to very small datasets. (A simplified sketch of this adaptive strategy follows this list.)
- When datasets must be moved from secondary storage or across a communication network in order to be sorted, the resources required for data movement (which is linear, a multiple of N) may in practice overshadow those taken for sorting.
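As a hypothetical illustration of that adaptive strategy, the sketch below switches to a simple quadratic algorithm (Insertion Sort is used here for the small-input fallback) below a small cutoff and uses Merge Sort above it; production routines such as Timsort and introsort apply the same idea with more elaborate heuristics.

```python
CUTOFF = 32  # illustrative threshold; real libraries tune this empirically

def insertion_sort(items):
    """Quadratic in the worst case, but fast on tiny inputs
    thanks to its simple inner loop and good locality."""
    for i in range(1, len(items)):
        key, j = items[i], i - 1
        while j >= 0 and items[j] > key:
            items[j + 1] = items[j]
            j -= 1
        items[j + 1] = key
    return items

def hybrid_sort(items):
    """Merge Sort for large inputs, Insertion Sort for small subproblems."""
    if len(items) <= CUTOFF:
        return insertion_sort(items)
    mid = len(items) // 2
    left, right = hybrid_sort(items[:mid]), hybrid_sort(items[mid:])
    # Standard merge of two sorted halves.
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]
```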

In spite of the complexity of this reality, the abstract theory of sorting is still taught because of its simplicity and power. It is still common to teach students that algorithms with high resource utilization on unbounded inputs should be avoided, or may be impossible to use in practice. We then rely on their unlearning these early lessons either in practical data management classes or on the job. The truth is considered too complex and does not lend itself to abstract proofs, a skill we are eager to teach.

I came across a more current example of such a mismatch when teaching an undergraduate Computer Networking course. One of the fundamental lessons of Network Architecture is that packet networks are not reliable: packets sent may be lost, may be corrupted in transit, may be delayed without bound, and may be misdelivered. There is also a theorem of distributed systems which tells us it is impossible to use an unreliable network to implement one that is completely reliable. But Computer Networking textbooks universally describe the Internet's Transmission Control Protocol (TCP) as "reliable."

The truth is complex. TCP implements a different error model than the Internet Protocol, using a number of tools including checksums, sequence numbering, and retransmission to reduce the probability of data corruption or misdelivery. TCP also introduces new error modes (such as a “broken connection”) which do not exist in the Internet’s underlying packet delivery model. Together, these are thought to reduce the probability of error to the point that most common applications are sufficiently stable, and this is considered a usable definition of “reliable.”
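The 16-bit checksum is weak in a very specific way. The following Python sketch (an illustration in the style of the RFC 1071 Internet checksum, not TCP's actual code path) shows a pair of compensating single-bit flips that leave the checksum unchanged and would therefore pass undetected.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones'-complement checksum in the style TCP uses (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"                            # pad odd-length input
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)   # end-around carry
    return ~total & 0xFFFF

original  = b"\x12\x34\x00\x01 some payload"
# Two compensating single-bit flips: the first 16-bit word goes up by one
# and the second goes down by one, so the ones'-complement sum is unchanged.
corrupted = b"\x12\x35\x00\x00 some payload"

print(hex(internet_checksum(original)), hex(internet_checksum(corrupted)))  # identical values
```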

Application developers regularly use TCP as if it were completely reliable, although its 16-bit checksum makes “silent data corruption” highly probable in sufficiently large data transfers. Developers working in application areas such as scientific computing, where such errors are considered unacceptable, understand the issue and use longer end-to-end file hashes and retransmission to overcome errors. However, in fields such as multimedia streaming where occasional errors are thought to be acceptable, it is common to simply consider TCP to be “reliable enough.” The same problems haunt storage and computation at large scale, and are only addressed when the mismatch causes unacceptable application failures. It is worth noting that, because of the highly “discontinuous” nature of programmable systems, unanticipated errors can be the cause of massive security failures or seemingly unrelated aberrant system behavior.
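For applications that cannot tolerate silent corruption, the end-to-end pattern the author describes looks roughly like the sketch below (hypothetical names; `fetch` stands for whatever TCP-based transfer the application already performs): a strong whole-file hash is checked after the transfer, and the transfer is repeated on a mismatch.

```python
import hashlib

def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large transfers need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()

def verified_transfer(fetch, path: str, expected_sha256: str, retries: int = 3) -> None:
    """Repeat the transfer until the end-to-end hash matches, or give up."""
    for attempt in range(1, retries + 1):
        fetch(path)                                   # caller-supplied download over TCP
        if sha256_of_file(path) == expected_sha256:
            return
        print(f"integrity check failed on attempt {attempt}; retransferring")
    raise IOError("transfer failed end-to-end integrity check")
```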

Today we are faced with Generative AI, a new subfield of computer engineering that offers us the use of trained behaviors known to be error-prone. Generative AI may be considered "reliable enough" for some applications, at least initially. The relationship between Generative AI and the real world is unknown, because of the use of statistical and randomized methods. The application of more reliable guardrails requires the use of classical AI tools based on formal logic. These are not keeping up with the prolific ability of Generative AI to copy and guess.

The defense of Generative AI that it is “good enough” to be used  in a world already filled with human and machine error echoes the universal response of my networking students that my objection to calling TCP “reliable” is pedantic and old-fashioned. They say the word “reliable” has simply been given a new meaning. Or that “everyone knows” that the “reliable” mechanism must be checked (although this fact is not taught, and is not widely understood). Mainly, they want to keep the subject matter simple and understandable so they can learn to program using an intuitive API, get a grade, and a job. And the writers of textbooks and the developers of modern university curricula seem to agree with them.

So please, everyone who reacts in horror to the acceptance of “hallucinations” and other errors in generative AI, stop clutching your pearls. We have developed a field (computer science) based on the cynical acceptance of a mismatch between theory and practice. We have done this in order to validate our own education and the history of the field to date.

It is important that we teach practical computer engineering as a field separate from formal computer science. The latter can help in the understanding and analysis of the former, but may never model it well enough to be predictive in the way the physical sciences are.

Micah D. Beck (mbeck@utk.edu) is an associate professor at the Department of Electrical Engineering and Computer Science, University of Tennessee, Knoxville, TN, USA.
