AIhub, two days ago, 18:34
Interview with Debalina Padariya: Privacy-preserving generative models

 

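This article interviews Debalina Padariya, a participant in the AAAI/SIGAI Doctoral Consortium, about her research on privacy-preserving generative models. Her research focuses on building a framework to quantify the privacy/utility trade-offs in generative model-driven synthetic datasets. She describes her progress in detail, including a literature review of privacy-preserving generative models, an empirical study of the privacy risks and utility of synthetic tabular data, and a novel privacy-preserving synthetic data generation method for time-series applications. Debalina also shares her experience at the AAAI/SIGAI Doctoral Consortium, her advice for PhD students in AI, and a glimpse of her life as both a PhD student and a mother.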

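💡 Debalina's research centres on privacy-preserving generative models and aims to address privacy problems in synthetic data generation (SDG). She works on quantifying the trade-off between privacy and utility in generative model-driven synthetic datasets.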

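📚 Debalina's research includes a comprehensive literature review of privacy-preserving generative models, with taxonomies designed to clarify the similarities and differences among privacy and utility metrics. She also conducted an empirical study of the privacy risks and utility of synthetic tabular data produced by generative models.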

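⏳ Debalina developed a novel privacy-preserving synthetic data generation method designed for time-series applications. The method builds an extensible framework that integrates different modules and uses defences with differential privacy guarantees.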

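⚖️ Debalina's research reveals a trade-off between privacy and utility. For example, differential privacy can effectively reduce disclosure risk, but it may also lower the statistical quality and downstream performance of the synthetic data. She is working to identify suitable metrics for evaluating the utility of synthetic time-series data.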

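🎓 Debalina shares her experience at the AAAI/SIGAI Doctoral Consortium and her advice on pursuing a PhD in AI. She emphasises the long-term commitment a PhD requires and the importance of supervisors, peers, and the research community.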

In this interview series, we’re meeting some of the AAAI/SIGAI Doctoral Consortium participants to find out more about their research. In this latest interview, we hear from Debalina Padariya about her work on Privacy-Preserving Generative Models, why this is such an interesting area for study, the different projects she’s been involved in so far during her PhD, and her experience at the Doctoral Consortium at AAAI 2025.

Tell us a bit about your PhD – where are you studying, and what is the topic of your research?

I am currently pursuing a PhD at De Montfort University, UK, supported by the prestigious Alan Turing Institute and Accenture Strategic Partnership Program. My research primarily focuses on Privacy-Preserving Generative Models, and on designing a framework to quantify the privacy/utility trade-offs in generative model-driven synthetic datasets. Although Synthetic Data Generation (SDG) is one of the emerging use cases of generative AI, potential privacy attacks associated with generative models have emerged as critical issues. My research investigates the state-of-the-art privacy metrics in generative models (GMs) while considering the limitations of existing privacy-preserving approaches. This research aims to develop a novel privacy-preserving framework that will contribute to the practical advancement of synthetic data generation across industry and the public sector.

Could you give us an overview of the research you’ve carried out so far during your PhD?

For the first phase of the research, I comprehensively reviewed the literature on privacy-preserving GMs. I have produced a systematic literature survey that offers an in-depth analysis of an exhaustive list of publications to map the current landscape. Additionally, I have designed novel taxonomies to categorise the privacy and utility metrics of GMs, allowing us to understand the similarities and differences between metrics. The pre-print of my comprehensive literature survey is now available on arXiv: Privacy-Preserving Generative Models: A Comprehensive Survey.

In the second phase of my research, I conducted an empirical study focusing on the privacy risks and utility of synthetic tabular data generated by state-of-the-art generative models. This study addresses both privacy vulnerabilities and the impact of differential privacy mechanisms, with potential implications for fields that rely on synthetic data. I presented the findings of this study at the Women in Machine Learning Symposium (WiML2024), ICML 2024. Overall, my empirical findings indicate that while the popular approach, differential privacy, holds great promise for disclosure control and quantifying the privacy risk of synthetic data, it involves trade-offs that must be carefully considered. Another important aspect of SDG is its reliability in the current regulatory landscape, where worldwide regulatory efforts aim to establish ethical standards for the development of generative AI. I analysed how the use of generative model-based synthetic datasets intersects with emerging regulatory compliance requirements, and presented my findings at the 2nd ICML Workshop on Generative AI and Law (GenLaw ’24): Navigating Risks and Rewards of Generative Model-based Synthetic Datasets: A Regulatory Perspective.

In the third phase, I developed a novel methodology focused on privacy-preserving synthetic data generation for time-series applications. I have designed an extensible framework whose components and interfaces can be realised by different candidate modules, such as generative models, attacks, and defences. I rely on defences that provide state-of-the-art differential privacy guarantees, such as noise perturbation and gradient clipping strategies. I also proposed a new membership inference attack on synthetic time-series data in white-box settings to demonstrate the resistance of this framework to attacks. Further, I have performed an extensive evaluation to assess the performance of synthetic data using different qualitative and quantitative utility metrics, demonstrating its effectiveness with the SOTA (state-of-the-art) model. I presented the preliminary findings of my proposed framework at the AAAI 2025 Workshop on AI for Public Missions. Moreover, my PhD thesis proposal was accepted at the 2025 AAAI/SIGAI Doctoral Consortium, and is now available at A Privacy-Preserving Framework for Generative Model-driven Synthetic Datasets | Proceedings of the AAAI Conference on Artificial Intelligence.
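
As a rough illustration of what "gradient clipping" and "noise perturbation" usually mean in differentially private training, the sketch below follows the generic DP-SGD recipe: bound each example's gradient norm, sum, add Gaussian noise scaled to that bound, then average. It is not taken from Debalina's framework. The function name `clip_and_noise`, its parameters, and the toy gradients are assumptions made for this example, and a real system would also need a privacy accountant to turn the noise level into an (epsilon, delta) guarantee.

```python
import math
import random


def clip_and_noise(per_example_grads, clip_norm, noise_multiplier, rng):
    """Return a noisy average of per-example gradients (DP-SGD-style sketch)."""
    dim = len(per_example_grads[0])
    total = [0.0] * dim
    for grad in per_example_grads:
        norm = math.sqrt(sum(g * g for g in grad))
        # Shrink any gradient whose L2 norm exceeds clip_norm; leave others unchanged.
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        for i, g in enumerate(grad):
            total[i] += g * scale
    # Gaussian noise calibrated to the clipping bound, added once to the summed gradient.
    sigma = noise_multiplier * clip_norm
    noisy = [t + rng.gauss(0.0, sigma) for t in total]
    return [v / len(per_example_grads) for v in noisy]


rng = random.Random(0)
grads = [[3.0, 4.0], [0.3, 0.4]]  # hypothetical per-example gradients
print(clip_and_noise(grads, clip_norm=1.0, noise_multiplier=0.0, rng=rng))  # clipping only
print(clip_and_noise(grads, clip_norm=1.0, noise_multiplier=1.0, rng=rng))  # clipping plus noise
```

Clipping limits how much any single record can move the update, which is what lets the added noise mask that record's presence.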

Is there an aspect of your research that has been particularly interesting?

One interesting aspect of my research has been investigating privacy–utility trade-offs in synthetic data. This area is compelling because increasing privacy, particularly through mechanisms like differential privacy, often comes at the cost of data accuracy. So far, my findings reveal that while differential privacy can effectively mitigate disclosure risks and provide higher levels of privacy protection, it often significantly degrades the statistical quality and downstream performance of the synthetic data. This poses critical challenges for practitioners and policymakers in fields like healthcare and finance, where synthetic data is increasingly used to enable data sharing while preserving confidentiality. Navigating this trade-off is crucial for advancing trustworthy AI systems. Another interesting aspect of my research is selecting the appropriate evaluation metrics for synthetic time-series datasets. Since different metrics can lead to different trade-offs, and there is currently no clear consensus on the most suitable evaluation metrics for assessing synthetic time-series data, this remains an important challenge in the field.
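
The trade-off described above can be seen in even the simplest differentially private release. The sketch below is a toy illustration and not part of Debalina's study: it publishes the mean of a bounded dataset with the standard Laplace mechanism and measures the average error at several privacy budgets. The helper names, the uniform toy data, and the assumption that neighbouring datasets differ by replacing one record (giving sensitivity `(upper - lower) / n`) are all choices made for this example.

```python
import random
import statistics


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_mean(values, lower, upper, epsilon, rng):
    """Release the mean of values clamped to [lower, upper] with Laplace noise."""
    clamped = [min(max(v, lower), upper) for v in values]
    # Replacing one record can shift the clamped mean by at most this much.
    sensitivity = (upper - lower) / len(clamped)
    return statistics.fmean(clamped) + laplace_noise(sensitivity / epsilon, rng)


def mean_abs_error(values, lower, upper, epsilon, trials, rng):
    """Average absolute gap between the private mean and the true clamped mean."""
    true_mean = statistics.fmean([min(max(v, lower), upper) for v in values])
    errors = [
        abs(private_mean(values, lower, upper, epsilon, rng) - true_mean)
        for _ in range(trials)
    ]
    return statistics.fmean(errors)


rng = random.Random(42)
data = [rng.uniform(0.0, 100.0) for _ in range(200)]  # hypothetical records
for eps in (0.1, 1.0, 10.0):
    # Smaller epsilon means stronger privacy and, on average, larger error.
    print(eps, mean_abs_error(data, 0.0, 100.0, eps, trials=2000, rng=rng))
```

The noise scale is `sensitivity / epsilon`, so a tenfold stronger privacy budget costs roughly a tenfold larger expected error. Generative models trained with differential privacy show the same tension in a far less tractable form.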

What are your plans for building on your research so far during the PhD – what aspects will you be investigating next?

While privacy-preserving data publishing has seen notable success in image-based applications, other critical domains, such as time series data used in healthcare, finance, and weather forecasting, have not been significantly explored. My research particularly focuses on privacy-preserving synthetic data generation for sequential applications, where handling temporal dependencies is challenging and often requires complex models. I have already developed a novel privacy-preserving framework for time-series applications while evaluating the performance of synthetic data using suitable privacy and utility metrics. For the next step, I will evaluate the privacy/utility trade-offs that allow consistent comparison of privacy and utility values with different SDG configurations. Additionally, my research will focus on identifying appropriate utility metrics, determining when their scores are meaningful and when they may be prone to misinterpretation.

Debalina presenting her poster at the AAAI/SIGAI Doctoral Consortium

How was the AAAI/SIGAI Doctoral Consortium, and the AAAI conference experience in general?

Attending the AAAI/SIGAI Doctoral Consortium was one of the best experiences of my PhD journey. The organising committee designed the programme schedule very thoughtfully to support the PhD students. One of the highlights was the mentor–mentee session, where each group was paired with experienced experts from both academia and industry. Their advice on how to approach the transition after the PhD was incredibly valuable. There were also lunch and dinner gatherings organised by the program committee, where we had the chance to connect with fellow researchers from diverse backgrounds. I strongly encourage PhD students to attend a doctoral consortium at least once during their research journey.

Attending the AAAI Conference was a truly valuable experience for me. The standout moment was attending a talk by Dr. Andrew Ng, the world-famous AI scientist. I also had the opportunity to speak at the Women in AI Diversity and Inclusion Workshop, where I engaged in insightful discussions with top industry experts in this field. The conference also featured a job fair offering a chance to connect with potential employers about career opportunities. This was also my first experience in the United States, and attending one of the leading conferences in AI made it a memorable moment for me!

What advice would you give to someone thinking of doing a PhD in the field?

AI is one of the fastest-growing fields today, and doing a PhD in AI is incredibly demanding! Keep in mind that pursuing a PhD is a long-term commitment. The journey can be a rollercoaster ride, where your mental strength, dedication, and self-motivation will be challenged more than your intelligence. Additionally, as an international student doing a PhD abroad, you might face additional challenges, including being away from family support systems. However, this journey also offers immense personal and professional growth, as you push boundaries and develop resilience. One of the key elements of your PhD journey is your supervisors. Their guidance, feedback, and critical evaluation are central to shaping your growth as a researcher. Surround yourself with peers, engage with the wider research community, and share ideas and struggles with others. Attending conferences, workshops, or summer schools can also be an enriching experience, providing opportunities to gain valuable insights and build meaningful collaborations.

Could you tell us an interesting (non-AI-related) fact about you? 

I am originally from India, a country known for its rich tradition and deep-rooted values in education and innovation. While considering a PhD as my second baby, I’m also a full-time mum to a wonderful little boy! Spare time is a luxury as a PhD student and a parent, but with the constant support of my amazing husband, who shares parenting duties and encourages my academic goals, we make the most of it by spending quality time together. I enjoy cooking Indian cuisine, drawing, playing, and hanging out with my son, and the best part—trying to answer all his never-ending "why" questions about the world! He finds it amusing that his mum is still “studying just like him”—and I think that’s sparked a shared love for learning in both of us!

Debalina Padariya

I am a PhD research student in the School of Computer Science and Informatics at De Montfort University, UK, advised by Dr Aboozar Taherkhani (De Montfort University), Prof Eerke Boiten (De Montfort University), and Dr Isabel Wagner (University of Basel, Switzerland). My research is supported by Accenture and the Alan Turing Institute Strategic Partnership Program, UK. My PhD focuses on Privacy-Preserving Generative Models and on quantifying the trade-offs between privacy and utility. So far in my PhD journey, I have been invited as a Panel member at International Women’s Day at De Montfort University, UK, received the PhD student Ambassador Bursary at the Security and Privacy Conference SPRITE+, Belfast, Northern Ireland, served as a Volunteer at the Women in Machine Learning Symposium at ICML 2024, Vienna, Austria, been invited as a Speaker at the Women in AI Diversity and Inclusion Workshop at AAAI 2025, and served as an Organizing Committee member at the Post Graduate Research Conference at De Montfort University, UK. Beyond research, I have ten years of academic experience as an Assistant Professor in India. I am also a member of the Operational Research Society, UK, the Indian Society for Technical Education (ISTE), and the Computer Science Teachers Association (CSTA).
