Mathematical and computational simulators are instrumental in modeling complex scientific and industrial problems. A prevalent difficulty across domains is identifying the parameters that lead to certain experimental observations. We consider a stochastic simulator $f(\theta) = x,$ capable of producing observations $x$ from a set of parameters $\theta.$ This simulator enables the generation of samples from the likelihood $p(x \mid \theta),$ which is typically either intractable or unavailable. Evaluating the simulator on various parameter configurations yields a dataset $\mathcal{D} = \{x_1, \dots, x_n\}$ and a corresponding empirical distribution $p(x_o).$ With Sourcerer, [Vet24S] introduce a strategy to determine a source distribution $q(\theta)$ whose push-forward through the simulator,

$$q^{\#}(x) = \int_{\Theta} p(x \mid \theta)\, q(\theta)\, d\theta,$$

matches the empirical distribution.

A well-known tactic for estimating the source distribution is empirical Bayes, which refines the parameters $\phi$ of the prior by maximizing the marginal likelihood:

$$p(\mathcal{D}) = \prod_i \int p(x_i \mid \theta)\, q_{\phi}(\theta)\, d\theta.$$

However, this approach is inadequate for simulators where the likelihood is unavailable or where the problem of parameter estimation is ill-posed.

Maximum Entropy Source Distribution Estimation

The authors choose between competing source distributions by taking the one that maximizes entropy. Intuitively, this is the one that embodies the greatest level of ignorance. The entropy $H(p)$ of a distribution $p$ is defined as $H(p) = -\int p(\theta)\, \log p(\theta)\, d\theta.$ To identify the maximum entropy source distribution, [Vet24S] propose to maximize $H(q),$ subject to the constraint that the push-forward distribution $\int p(x \mid \theta)\, q(\theta)\, d\theta$ equals the empirical distribution $p(x_o).$

Figure 1. [Vet24S], Figure 2. The Sourcerer framework optimizes the parameters $\phi$ of a parametric distribution $q_{\phi}(\theta)$ in order to maximize its entropy while yielding a push-forward distribution similar to the observed distribution w.r.t. the Sliced-Wasserstein distance.

As shown by the authors, maximizing the entropy of $q(\theta)$ yields a unique source distribution, if one exists. To implement it, they relax the functional equality constraint with a penalty term, leading to the unconstrained problem

$$\max_q \left\{ \lambda H(q) - (1 - \lambda) \log\left( D(q^{\#}, p_o)^2 \right) \right\},$$

where $D(q^{\#}, p_o)$ measures the discrepancy between the push-forward and the empirical distributions, and $\lambda$ controls the penalty strength. The authors propose to use the Sliced-Wasserstein distance, as it is sample-based and circumvents the direct evaluation of the likelihood. The logarithmic term enhances numerical stability.
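This relaxed objective can be estimated purely from samples. Below is a minimal sketch of one loss evaluation, assuming a differentiable simulator and, purely for illustration, a diagonal Gaussian source $q_{\phi}$ with closed-form entropy (the paper instead uses the unconstrained neural samplers of [Van21N], whose entropy is estimated from samples). The names `sliced_wasserstein_sq`, `sourcerer_style_loss`, `simulator`, and `lam` are our own placeholders, not the paper's API.

```python
import torch

def sliced_wasserstein_sq(x, y, n_proj=64):
    """Monte Carlo estimate of the squared Sliced-Wasserstein-2 distance
    between two sample sets x, y of identical shape (n, d)."""
    d = x.shape[1]
    proj = torch.randn(d, n_proj, device=x.device)   # random directions
    proj = proj / proj.norm(dim=0, keepdim=True)     # normalize to the unit sphere
    x_p, _ = torch.sort(x @ proj, dim=0)             # sorted 1-D projections
    y_p, _ = torch.sort(y @ proj, dim=0)
    return ((x_p - y_p) ** 2).mean()                 # average 1-D W2^2 over slices

def sourcerer_style_loss(mu, log_sigma, x_obs, simulator, lam=0.25):
    """Negative of the relaxed objective lam * H(q) - (1 - lam) * log(SW^2)."""
    q = torch.distributions.Independent(
        torch.distributions.Normal(mu, log_sigma.exp()), 1)
    theta = q.rsample((x_obs.shape[0],))   # reparameterized draws keep gradients
    x_push = simulator(theta)              # push-forward samples, same count as x_obs
    sw2 = sliced_wasserstein_sq(x_push, x_obs)
    return -(lam * q.entropy() - (1 - lam) * torch.log(sw2))
```

In practice one would minimize this loss over $\phi = (\mu, \log\sigma)$ with a stochastic gradient optimizer, drawing a minibatch of observations of matching size at every step; a larger $\lambda$ trades fidelity of the push-forward for higher entropy of the source.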
Incorporating the Bayesian perspective, with prior information about the source, the authors substitute the entropy term $H(q)$ with the Kullback-Leibler (KL) divergence between the estimated source distribution $q(\theta)$ and the initial prior $p(\theta).$ Doing so can be seen as regularizing the solution to stay close to the prior. Since the KL divergence decomposes into the entropy $H(q)$ and the cross-entropy $H(q, p),$ this new formulation remains amenable to sample-based estimation and preserves the original intention of covering a large portion of the parameter space. The optimization thus becomes a balance between the KL divergence and the discrepancy measure:

\begin{align}
& \lambda D_{\text{KL}}(q \Vert p) + (1 - \lambda) \log\left( D(q^{\#}, p_o)^2 \right) \\
= & -\lambda H(q) + \lambda H(q, p) + (1 - \lambda) \log\left( D(q^{\#}, p_o)^2 \right).
\end{align}

In the second line, the KL divergence is expressed in terms of the entropy and the cross-entropy between the source and the prior distribution. To approximate the source distribution $q(\theta),$ the authors utilize unconstrained artificial neural networks, as presented by [Van21N].

Numerical Experiments

Figure 3. [Vet24S], Figure 4. Comparison of the true and estimated source distributions (left) and of the observed data against the push-forward distribution (right).

The authors validate their method through detailed numerical examples. First, they benchmark their approach on the two moons, the inverse kinematics (IK), and the simple likelihood complex posterior (SLCP) tasks, all three presented by [Van21N] specifically for empirical Bayes. They further demonstrate the algorithm's effectiveness in complex scenarios with differentiable simulators, specifically the Lotka-Volterra and SIR models, showcasing the method's adaptability and strength in diverse simulation contexts. Finally, the authors apply Sourcerer to the Hodgkin-Huxley model, a well-known neuron model, to estimate the source distribution of its parameters. In all cases, they use the Classifier Two-Sample Test (C2ST) to evaluate the quality of the push-forward distributions obtained from the estimated source (Figure 2; see the sketch below for how such a test can be computed).

The numerical experiments showcase the method's ability to accurately estimate a source distribution that yields a push-forward distribution very close to the observed data. Furthermore, the estimated source distributions show a greater level of entropy, as desired (Figure 2).

Figure 2. [Vet24S], Figure 3. The choice of $\lambda$ is an important hyperparameter. The plots show the amount of entropy and the quality of the push-forward distribution w.r.t. the choice of $\lambda.$
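For a concrete idea of how a C2ST score can be computed from two sample sets, here is a minimal sketch using scikit-learn; the classifier choice and hyperparameters are illustrative assumptions, not the setup used in [Vet24S]. A cross-validated accuracy near 0.5 indicates that push-forward samples are hard to distinguish from the observed data, while values well above 0.5 signal a mismatch.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

def c2st_accuracy(x_obs, x_push, seed=0):
    """Classifier two-sample test: cross-validated accuracy of a classifier
    trained to separate observed samples from push-forward samples."""
    X = np.concatenate([x_obs, x_push], axis=0)
    y = np.concatenate([np.zeros(len(x_obs)), np.ones(len(x_push))])
    clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=1000, random_state=seed)
    return cross_val_score(clf, X, y, cv=5, scoring="accuracy").mean()
```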