Can we infer the search space of a local optimiser?

 

This post explores how to identify and understand the behaviour of a system suspected of being a local optimiser. The system exhibits change in a surface state space $s$, but its optimisation may actually take place in an unknown hidden space $h$. The post proposes that by observing trajectories of the system state $s$ over time, and assuming that the optimisation steps in the hidden space $h$ are local, one can try to infer the map from $s$ to $h$. It also discusses whether the hidden space $h$ can be described as a metric space, and how techniques such as autoencoders might be used to uncover this map and thereby test whether the system really is a local optimiser.

🤔 Evolution of the system state: the system state $s$ evolves over time. On the surface this is a transition from $s_t$ to $s_{t+1}$, but it may actually pass through a hidden space, i.e. $s_t \to h_t \to h_{t+1} \to s_{t+1}$, where $h_t$ is a state in the hidden space $H$.

🔍 Local optimisation hypothesis: assume $h_t \to h_{t+1}$ is a step of a local optimisation process, i.e. the system is trying to minimise an evaluation function $f(h_t)$ by searching the hidden space $H$ with a local method such as gradient descent.

💡 Inferring the hidden space: by observing enough system states $s_t$ and exploiting characteristic properties of local search trajectories, one can try to infer the map from $s_t$ to $h_t$, for example by training an autoencoder to learn a map $s_t \to h'_t$ while minimising the distance between $h'_t$ and $h'_{t+1}$.

📏 Metric space assumption: the post discusses whether the hidden space $H$ can be described as a metric space, since many local search algorithms come with some notion of distance between points in the space, which matters for inferring the hidden space's structure.

Published on February 3, 2025 10:17 AM GMT

Suppose we have a system that we suspect is some form of local optimiser. Maybe it's gradient descent training a neural network. Maybe it’s a neural network doing in-context learning. Maybe it's a mind, because we guess that the operation of minds in general is to an extent well-described as local movement in some 'abstraction space' a lot of the time. Maybe it’s something else.

Is there, on paper, a general way:

    1. To check whether our guess that the system is a local optimiser is correct?
    2. To find the space in which the optimiser is taking local steps?

Setup

Concretely, say a system has a state vector $s$, which we can see. $s$ evolves over time steps $t$, so let's call it $s_t$. Our guess is that the system evolves $s_t$ through some local optimisation method. But it's not really local in $s$-space. It's local in some other space we don't know about. Let's call the state vector in that hidden space we don't know about $h_t \in H$, where $H$ is some space. $H$ seems guaranteed to be a topological space, else the idea of a local search in it wouldn't make much sense. Maybe in a lot of cases we can also assume that it is a metric space? Not sure about that one, but a lot of the local search algorithms I can think of seem to have some notion of distance between points in the space.

So, overall the system maps $s_t \to s_{t+1}$ as $s_t \to h_t \to h_{t+1} \to s_{t+1}$, where the transformations $s_t \to h_t$ and $h_{t+1} \to s_{t+1}$ can be complicated and nonlinear.

$h_t \to h_{t+1}$ is assumed to be a step in some local optimisation method. That is to say, there is some evaluation function $f(h_t)$ that the system is trying to minimise, and it tries to do this minimisation via something like a genetic algorithm, gradient descent, Adam, or something else that makes use of local information to take one step in the space $H$ per time step $t$. The evaluation function $f$ should have the kind of loss landscape a local optimiser could reasonably navigate. No parity learning problems or anything like that. It also shouldn't be completely trivial to find the minimum of $f$. I.e. if the optimiser is using Newton's method, $f$ shouldn't just be quadratic everywhere, else there isn't much of an optimisation trajectory for us to look at because the system always finds the minimum in one step.
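To make the setup concrete, here is a tiny synthetic instance of this picture. Everything in it, the evaluation function behind `grad_f`, the observation map `g`, the dimensions and step size, is a made-up illustration rather than anything from a real system: a hidden state $h_t$ does plain gradient descent in $H$, and all an observer ever sees is $s_t = g(h_t)$ for a fixed nonlinear $g$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical evaluation function on the hidden space H = R^2:
# f(h) = 0.5 * (h_1^2 + 10 * h_2^2), an anisotropic quadratic bowl.
def grad_f(h):
    return np.array([1.0, 10.0]) * h

# Fixed nonlinear observation map g: H -> S, with S = R^5.
W = rng.normal(size=(5, 2))
b = rng.normal(size=5)

def g(h):
    return np.tanh(W @ h + b)

# Run gradient descent in H, but record only the surface states s_t.
h = rng.normal(size=2)
surface_states = []
for t in range(100):
    surface_states.append(g(h))
    h = h - 0.05 * grad_f(h)  # one local step in H per time step

s = np.stack(surface_states)  # shape (100, 5): this is all we get to see
```

Consecutive $h_t$ in this toy run are close to each other in Euclidean distance, while distances between consecutive $s_t$ are distorted by $g$; the question below is whether the $s_t$ alone are enough to recover something like the hidden coordinates.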

Now, what I'm wondering about is this: Is there some characteristic attribute of the trajectories local optimisations take in their search spaces which we can use to figure out the map $s_t \to h_t$ with reasonable confidence, provided we have enough states $s_t$ from different searches to look at?

Example of what this could look like

I'm not claiming the following would actually work; I'm just trying to point to the sort of thing my brain is thinking about by giving a concrete example.

In a lot of cases I can think of, neighbouring points in the time series of a local search will tend to be close to each other in terms of Euclidean distance in the search space. So, we could try to find a map $s_t \to h'_t$ such that $\lVert h'_{t+1} - h'_t \rVert$ will tend to be small. We could do this by training an autoencoder with a deep and expressive architecture on a loss of:

    1. Reconstructing the original system outputs. So, learn some map $s_t \to h'_t \to s_t$.
    2. Minimising some term like $\frac{\lVert h'_{t+1} - h'_t \rVert}{\sigma}$, with $\sigma = \sqrt{\mathbb{E}\left[\lVert h' - \mathbb{E}[h'] \rVert^2\right]}$, where the standard deviation $\sigma$ and expectation $\mathbb{E}$ are taken over the batch.

And then maybe the map $s_t \to h'_t$ we can now explicitly see in the autoencoder would tend to roughly match the map $s_t \to h_t$ we're looking for, up to trivial transformations like affine transforms. And if the autoencoder training can't converge with good loss, then maybe that means the original system wasn't a local optimiser, or at least not a local optimiser taking small steps in a Euclidean space.
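As a rough illustration of what that training could look like, here is a minimal PyTorch sketch under the assumptions above (Euclidean step distances, normalisation by the batch standard deviation of $h'$ so the encoder can't cheat by shrinking everything to a point). The architecture, the loss weight `lam`, and the placeholder data are all hypothetical choices, not a tested recipe.

```python
import torch
import torch.nn as nn

class AE(nn.Module):
    """Deep autoencoder: encoder plays the role of the candidate map s_t -> h'_t."""
    def __init__(self, s_dim, h_dim, width=128):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Linear(s_dim, width), nn.ReLU(),
            nn.Linear(width, width), nn.ReLU(),
            nn.Linear(width, h_dim),
        )
        self.dec = nn.Sequential(
            nn.Linear(h_dim, width), nn.ReLU(),
            nn.Linear(width, width), nn.ReLU(),
            nn.Linear(width, s_dim),
        )

    def forward(self, s):
        h = self.enc(s)
        return h, self.dec(h)

def loss_fn(model, s, lam=1.0):
    # s: (batch, T, s_dim) -- a batch of observed optimisation trajectories.
    h, s_hat = model(s)
    recon = ((s_hat - s) ** 2).mean()  # term 1: reconstruct the system outputs
    # Term 2: step sizes in the candidate hidden space, normalised by the
    # standard deviation of h' over the batch to rule out trivial collapse.
    steps = (h[:, 1:] - h[:, :-1]).norm(dim=-1)
    scale = h.std() + 1e-8
    return recon + lam * (steps / scale).mean()

model = AE(s_dim=5, h_dim=2)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
s_batch = torch.randn(32, 100, 5)  # placeholder; use real observed trajectories
for step in range(1000):
    opt.zero_grad()
    loss = loss_fn(model, s_batch)
    loss.backward()
    opt.step()
```

If training like this converges to a low combined loss, the learned encoder is a candidate for the map $s_t \to h_t$, up to the trivial transformations mentioned above; if it can't, that's weak evidence against the local-optimiser guess, at least in its small-Euclidean-steps form.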

Summary

    1. Should we expect local search spaces to often be describable as metric spaces, for the kind of local searches we might often see in real life? Local searches run by in-context learning algorithms or other minds, for example.
    2. If a local optimiser maps $s_t \to h_t \to h_{t+1} \to s_{t+1}$, with $h_t$ being the states in the local search space, can we infer the map $s_t \to h_t$ from seeing enough states $s_t$,[1] by leveraging our knowledge that the time series $h_t$ has to look like the trajectory of a local search?
  1. ^ Up to affine transformations and such, of course.


