<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en"><generator uri="https://jekyllrb.com/" version="4.3.4">Jekyll</generator><link href="https://anshumansuri.com/feed.xml" rel="self" type="application/atom+xml"/><link href="https://anshumansuri.com/" rel="alternate" type="text/html" hreflang="en"/><updated>2026-02-27T17:52:00+00:00</updated><id>https://anshumansuri.com/feed.xml</id><title type="html">blank</title><subtitle>A simple, whitespace theme for academics. Based on [*folio](https://github.com/bogoli/-folio) design. </subtitle><entry><title type="html">Advice for working on ML projects</title><link href="https://anshumansuri.com/blog/2025/ml-considerations/" rel="alternate" type="text/html" title="Advice for working on ML projects"/><published>2025-07-17T00:00:00+00:00</published><updated>2025-07-17T00:00:00+00:00</updated><id>https://anshumansuri.com/blog/2025/ml-considerations</id><content type="html" xml:base="https://anshumansuri.com/blog/2025/ml-considerations/"><![CDATA[<p>Working on ML projects in academia (and beyond) often feels like a constant battle between moving fast to test ideas and maintaining enough organization to actually make progress.</p> <p>Based on my experiences, working with collaborators who have diverse coding backgrounds, and—perhaps most importantly—browsing through GitHub repositories of varying quality, I’ve picked up practices and design patterns that have genuinely transformed how I approach ML projects. These aren’t abstract software engineering principles; they’re tested techniques that have saved me from countless headaches and helped me move faster while making fewer mistakes.</p> <p>In this blog, I’ll document the lessons that have made the biggest difference in my day-to-day research workflow. Some might seem obvious in hindsight, others might challenge how you currently organize your work. 
Either way, I hope they help you spend less time wrestling with logistics and more time focused on the actual science. Note that this is a living document<d-footnote>This means I will update it every now and then based on things I learn</d-footnote>—I’m constantly learning new tricks, and I’ll add them here as I discover what works.</p> <h1 id="structuring-your-codebase">Structuring your Codebase</h1> <p>Some experiments are straightforward and can be self-contained in a file or two. However, most ML projects that span a few weeks or more often end up with growing codebase sizes, with lots of reusable content that can bloat the overall project and lead to subtle inconsistencies when running experiments for different setups.</p> <p>Let’s say you’ve developed a new form of adversarial training and want to run experiments for varying perturbation strengths—including a baseline without any defense. Your project folder might look like this:</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>experiments/
├── standard_training.py
├── adversarial_training.py
</code></pre></div></div> <p>Now, during your standard training run, you notice the learning rate is too high (say you started with <code class="language-plaintext highlighter-rouge">1e-3</code>) and reduce it to <code class="language-plaintext highlighter-rouge">1e-4</code>, which fixes the issue. However, since you have separate files for adversarial and standard training, you forget to push the same update to the other file. Your experimental runs now differ not just in the presence/absence of adversarial training, but also in the optimizer hyperparameters—which can have non-trivial impacts on learning dynamics and final results.</p> <p>This example might seem minor, but with growing project sizes and hyperparameters, it’s easy to see how things can go wrong quickly. A straightforward solution would be to have a single <code class="language-plaintext highlighter-rouge">training.py</code> file and support standard training by setting the perturbation budget <code class="language-plaintext highlighter-rouge">epsilon</code> to 0 (or some other sentinel value). It could look something like:</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">if</span> <span class="n">args</span><span class="p">.</span><span class="n">epsilon</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span>
  <span class="c1"># Standard training
</span>  <span class="nf">train</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="n">optim</span><span class="p">,</span> <span class="n">data_loader</span><span class="p">)</span>
<span class="k">else</span><span class="p">:</span>
  <span class="c1"># Adversarial training
</span>  <span class="nf">adv_train</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="n">optim</span><span class="p">,</span> <span class="n">data_loader</span><span class="p">,</span> <span class="n">args</span><span class="p">.</span><span class="n">epsilon</span><span class="p">)</span>
</code></pre></div></div> <p>This ensures the <code class="language-plaintext highlighter-rouge">model</code>, <code class="language-plaintext highlighter-rouge">optim</code>, and any other common components are used exactly the same way for both experiments. This approach is intuitive once you think about it, but I’ve seen many researchers<d-footnote>I've been guilty of this at several occasions.</d-footnote> and GitHub projects (especially academic ones) fall into the code duplication trap.</p> <p>Going further with this example, I’ve also seen the equivalent of <code class="language-plaintext highlighter-rouge">adversarial_training_eps4.py</code> in the example above—creating duplicate files with nearly identical code and minor differences (mostly hyperparameters or datasets). This compounds the diverging changes problem and makes it hard to track what’s actually different between experiments.</p> <p>This “single point of failure” approach, in my opinion, is actually useful for research (as long as you catch the bugs, of course). For instance, let’s say all your files use some common evaluation function:</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">evaluate</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="n">loader</span><span class="p">):</span>
  <span class="n">acc</span> <span class="o">=</span> <span class="mi">0</span>
  <span class="k">for</span> <span class="n">x</span><span class="p">,</span> <span class="n">y</span> <span class="ow">in</span> <span class="n">loader</span><span class="p">:</span>
    <span class="n">y_pred</span> <span class="o">=</span> <span class="nf">model</span><span class="p">(</span><span class="n">x</span><span class="p">)</span>
    <span class="n">acc</span> <span class="o">+=</span> <span class="p">(</span><span class="n">y_pred</span> <span class="o">==</span> <span class="n">y</span><span class="p">).</span><span class="nf">sum</span><span class="p">()</span>
  <span class="n">model</span><span class="p">.</span><span class="nf">train</span><span class="p">()</span>
  <span class="k">return</span> <span class="n">acc</span> <span class="o">/</span> <span class="nf">len</span><span class="p">(</span><span class="n">loader</span><span class="p">)</span>
</code></pre></div></div> <p>There are two big issues here:</p> <ul> <li>the accuracy calculation is incorrect: <code class="language-plaintext highlighter-rouge">len(loader)</code> gives the number of batches, not the dataset size, and</li> <li>the model is returned to training mode after evaluation but is never set to eval mode before evaluation begins. This can be especially problematic when the model has data-dependent layers like batch normalization that accumulate statistics.</li> </ul> <p>When the researcher catches this issue, they can at least be assured that whatever mistake they made invalidates all their experiments equally (requiring a complete redo), rather than having the same function in another file, correcting it only there, and making incorrect conclusions about which technique works better.</p> <h2 id="pip-it">PIP-it!</h2> <p>As your codebase grows and you start working on multiple related projects, you’ll inevitably find yourself copy-pasting utility functions, model implementations, and evaluation scripts across different repositories. Let’s say you’ve developed a novel membership inference attack for your latest paper. Six months later, you’re working on a different project and want to use that same attack as a baseline or evaluation metric. What do you do? Copy the files over? Git submodule? Reimplement it from scratch because you can’t find the exact version that worked?</p> <p>This is where creating a proper Python package from your research code can help. Not only does it make your life easier when reusing code across projects, but it also makes it significantly more likely that others will actually use your research. 
Think about it: would you rather</p> <ul> <li>😵‍💫 download a ZIP file, dig through someone’s experimental scripts, and try to figure out which functions are reusable, or would you prefer to</li> <li>😺 simply <code class="language-plaintext highlighter-rouge">pip install</code> their package and import the functions you need?</li> </ul> <p>The latter is much more appealing, and higher adoption of your methods means more impact. Here’s how this evolution typically looks. You start with a project structure like this:</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>membership_inference_project/
├── train_target_model.py
├── run_mia_attack.py
├── utils.py  <span class="c"># Data loading, metrics, plotting</span>
└── models.py <span class="c"># Target models and attack models</span>
</code></pre></div></div> <p>But then you realize that your attack implementation in <code class="language-plaintext highlighter-rouge">models.py</code> is generic enough that others could use it. Instead of letting this code rot in a single project folder, you can structure it as a proper package:</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mia_toolkit/
├── setup.py
├── README.md
├── mia_toolkit/
│   ├── __init__.py
│   ├── attacks/
│   │   ├── __init__.py
│   │   └── membership_inference.py
│   ├── data/
│   │   ├── __init__.py
│   │   └── loaders.py
│   └── utils/
│       ├── __init__.py
│       └── metrics.py
</code></pre></div></div> <p>With a minimal <code class="language-plaintext highlighter-rouge">setup.py</code>, you can now install this directly from GitHub. Note the editable flag <code class="language-plaintext highlighter-rouge">-e</code> in the command below: it is particularly useful for packages under active development, since you can edit the code without having to reinstall the package after every change!</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># In your new project
</span>pip install -e git+https://github.com/yourusername/mia-toolkit.git#egg=mia_toolkit

<span class="c1"># Clean imports in your code
</span><span class="kn">from</span> <span class="n">mia_toolkit.attacks</span> <span class="kn">import</span> <span class="n">MembershipInferenceAttack</span>
<span class="kn">from</span> <span class="n">mia_toolkit.data</span> <span class="kn">import</span> <span class="n">load_private_dataset</span>
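# Aside: the minimal setup.py behind the install command above could look like
# this (a sketch; the package name and version are the hypothetical ones used in this post):

```python
# setup.py (sketch for the hypothetical mia_toolkit package)
from setuptools import setup, find_packages

setup(
    name="mia_toolkit",
    version="0.1.0",
    packages=find_packages(),
    install_requires=[],  # keep dependencies minimal and well-documented
)
```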
</code></pre></div></div> <p>The benefits extend beyond just your own convenience. When other researchers want to compare against your method, they don’t need to reverse-engineer your experimental scripts—they can simply install your package and focus on the science. This dramatically lowers the barrier to adoption and increases the likelihood that your work will be built upon by others.</p> <div class="alert alert-info" role="alert"> That being said, don't go overboard with this. Not every 50-line script needs to become a package, and there's a delicate balance between making functions generic enough for reuse versus specific enough to actually be useful for your research. I typically package code when I find myself copy-pasting the same utilities across 2-3 projects, or when I think the methods are novel enough that others might want to use them as baselines. </div> <p>A few practical notes: keep your package dependencies minimal and well-documented. I personally try to maintain one conda environment for most of my work, creating new ones only when external baselines require very specific package versions that would otherwise create conflicts.</p> <h2 id="dataclasses-are-your-friend">Dataclasses are your friend</h2> <p>I’ll be honest—this is one of those “do as I say, not as I did” moments. 
If you look at some of my <a href="https://github.com/suyeecav/model-targeted-poisoning/blob/342f35f7d1204c3a61e84b48c143ec819a55374c/dnn/mtp_dnn.py#L235">older projects</a>, you’ll see argument parsing that looks like this:</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">parser</span><span class="p">.</span><span class="nf">add_argument</span><span class="p">(</span><span class="sh">'</span><span class="s">--data_dir</span><span class="sh">'</span><span class="p">,</span> <span class="nb">type</span><span class="o">=</span><span class="nb">str</span><span class="p">,</span> <span class="n">default</span><span class="o">=</span><span class="sh">'</span><span class="s">data</span><span class="sh">'</span><span class="p">)</span>
<span class="n">parser</span><span class="p">.</span><span class="nf">add_argument</span><span class="p">(</span><span class="sh">'</span><span class="s">--model_arch</span><span class="sh">'</span><span class="p">,</span> <span class="nb">type</span><span class="o">=</span><span class="nb">str</span><span class="p">,</span> <span class="n">default</span><span class="o">=</span><span class="sh">'</span><span class="s">ResNet18</span><span class="sh">'</span><span class="p">)</span>
<span class="n">parser</span><span class="p">.</span><span class="nf">add_argument</span><span class="p">(</span><span class="sh">'</span><span class="s">--lr</span><span class="sh">'</span><span class="p">,</span> <span class="nb">type</span><span class="o">=</span><span class="nb">float</span><span class="p">,</span> <span class="n">default</span><span class="o">=</span><span class="mf">0.1</span><span class="p">)</span>
<span class="n">parser</span><span class="p">.</span><span class="nf">add_argument</span><span class="p">(</span><span class="sh">'</span><span class="s">--momentum</span><span class="sh">'</span><span class="p">,</span> <span class="nb">type</span><span class="o">=</span><span class="nb">float</span><span class="p">,</span> <span class="n">default</span><span class="o">=</span><span class="mf">0.9</span><span class="p">)</span>
<span class="n">parser</span><span class="p">.</span><span class="nf">add_argument</span><span class="p">(</span><span class="sh">'</span><span class="s">--weight_decay</span><span class="sh">'</span><span class="p">,</span> <span class="nb">type</span><span class="o">=</span><span class="nb">float</span><span class="p">,</span> <span class="n">default</span><span class="o">=</span><span class="mf">5e-4</span><span class="p">)</span>
<span class="n">parser</span><span class="p">.</span><span class="nf">add_argument</span><span class="p">(</span><span class="sh">'</span><span class="s">--batch_size</span><span class="sh">'</span><span class="p">,</span> <span class="nb">type</span><span class="o">=</span><span class="nb">int</span><span class="p">,</span> <span class="n">default</span><span class="o">=</span><span class="mi">128</span><span class="p">)</span>
<span class="n">parser</span><span class="p">.</span><span class="nf">add_argument</span><span class="p">(</span><span class="sh">'</span><span class="s">--epochs</span><span class="sh">'</span><span class="p">,</span> <span class="nb">type</span><span class="o">=</span><span class="nb">int</span><span class="p">,</span> <span class="n">default</span><span class="o">=</span><span class="mi">200</span><span class="p">)</span>
<span class="n">parser</span><span class="p">.</span><span class="nf">add_argument</span><span class="p">(</span><span class="sh">'</span><span class="s">--poison_lr</span><span class="sh">'</span><span class="p">,</span> <span class="nb">type</span><span class="o">=</span><span class="nb">float</span><span class="p">,</span> <span class="n">default</span><span class="o">=</span><span class="mf">0.1</span><span class="p">)</span>
<span class="n">parser</span><span class="p">.</span><span class="nf">add_argument</span><span class="p">(</span><span class="sh">'</span><span class="s">--poison_momentum</span><span class="sh">'</span><span class="p">,</span> <span class="nb">type</span><span class="o">=</span><span class="nb">float</span><span class="p">,</span> <span class="n">default</span><span class="o">=</span><span class="mf">0.9</span><span class="p">)</span>
<span class="n">parser</span><span class="p">.</span><span class="nf">add_argument</span><span class="p">(</span><span class="sh">'</span><span class="s">--poison_epochs</span><span class="sh">'</span><span class="p">,</span> <span class="nb">type</span><span class="o">=</span><span class="nb">int</span><span class="p">,</span> <span class="n">default</span><span class="o">=</span><span class="mi">50</span><span class="p">)</span>
<span class="n">parser</span><span class="p">.</span><span class="nf">add_argument</span><span class="p">(</span><span class="sh">'</span><span class="s">--target_class</span><span class="sh">'</span><span class="p">,</span> <span class="nb">type</span><span class="o">=</span><span class="nb">int</span><span class="p">,</span> <span class="n">default</span><span class="o">=</span><span class="mi">0</span><span class="p">)</span>
<span class="n">parser</span><span class="p">.</span><span class="nf">add_argument</span><span class="p">(</span><span class="sh">'</span><span class="s">--poison_fraction</span><span class="sh">'</span><span class="p">,</span> <span class="nb">type</span><span class="o">=</span><span class="nb">float</span><span class="p">,</span> <span class="n">default</span><span class="o">=</span><span class="mf">0.1</span><span class="p">)</span>
<span class="n">parser</span><span class="p">.</span><span class="nf">add_argument</span><span class="p">(</span><span class="sh">'</span><span class="s">--save_model</span><span class="sh">'</span><span class="p">,</span> <span class="n">action</span><span class="o">=</span><span class="sh">'</span><span class="s">store_true</span><span class="sh">'</span><span class="p">)</span>
<span class="n">parser</span><span class="p">.</span><span class="nf">add_argument</span><span class="p">(</span><span class="sh">'</span><span class="s">--cuda_visible_devices</span><span class="sh">'</span><span class="p">,</span> <span class="nb">type</span><span class="o">=</span><span class="nb">str</span><span class="p">,</span> <span class="n">default</span><span class="o">=</span><span class="sh">'</span><span class="s">0</span><span class="sh">'</span><span class="p">)</span>
<span class="n">parser</span><span class="p">.</span><span class="nf">add_argument</span><span class="p">(</span><span class="sh">'</span><span class="s">--random_seed</span><span class="sh">'</span><span class="p">,</span> <span class="nb">type</span><span class="o">=</span><span class="nb">int</span><span class="p">,</span> <span class="n">default</span><span class="o">=</span><span class="mi">0</span><span class="p">)</span>
<span class="c1"># ... and about 15 more arguments
</span></code></pre></div></div> <p>This gets unwieldy fast, and worse, it’s error-prone. What if you have both <code class="language-plaintext highlighter-rouge">args.lr</code> and <code class="language-plaintext highlighter-rouge">args.poison_lr</code>? It’s easy to accidentally use the wrong one in your code, especially when you’re debugging at 2 AM<d-footnote>Old habits: Bryan Johnson and Matthew Walker have convinced me to improve my sleeping habits. You should too; it makes a big difference!</d-footnote>.</p> <p>My favorite go-to for these situations is <a href="https://github.com/lebrice/SimpleParsing">SimpleParsing</a>—a wrapper around argparse that leverages Python’s dataclass functionality. Instead of the mess above, you can structure your arguments hierarchically:</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="n">dataclasses</span> <span class="kn">import</span> <span class="n">dataclass</span>
<span class="kn">from</span> <span class="n">simple_parsing</span> <span class="kn">import</span> <span class="n">ArgumentParser</span>

<span class="nd">@dataclass</span>
<span class="k">class</span> <span class="nc">TrainingConfig</span><span class="p">:</span>
    <span class="sh">"""</span><span class="s">Configuration for model training</span><span class="sh">"""</span>
    <span class="n">lr</span><span class="p">:</span> <span class="nb">float</span> <span class="o">=</span> <span class="mf">0.1</span>                    <span class="c1"># Learning rate for optimizer
</span>    <span class="n">momentum</span><span class="p">:</span> <span class="nb">float</span> <span class="o">=</span> <span class="mf">0.9</span>              <span class="c1"># Momentum for SGD optimizer  
</span>    <span class="n">weight_decay</span><span class="p">:</span> <span class="nb">float</span> <span class="o">=</span> <span class="mf">5e-4</span>         <span class="c1"># L2 regularization strength
</span>    <span class="n">batch_size</span><span class="p">:</span> <span class="nb">int</span> <span class="o">=</span> <span class="mi">128</span>              <span class="c1"># Training batch size
</span>    <span class="n">epochs</span><span class="p">:</span> <span class="nb">int</span> <span class="o">=</span> <span class="mi">200</span>                  <span class="c1"># Number of training epochs
</span>
<span class="nd">@dataclass</span>
<span class="k">class</span> <span class="nc">PoisonConfig</span><span class="p">:</span>
    <span class="sh">"""</span><span class="s">Configuration for poisoning attack</span><span class="sh">"""</span>
    <span class="n">lr</span><span class="p">:</span> <span class="nb">float</span> <span class="o">=</span> <span class="mf">0.1</span>                    <span class="c1"># Learning rate for poison optimization
</span>    <span class="n">momentum</span><span class="p">:</span> <span class="nb">float</span> <span class="o">=</span> <span class="mf">0.9</span>              <span class="c1"># Momentum for poison optimizer
</span>    <span class="n">epochs</span><span class="p">:</span> <span class="nb">int</span> <span class="o">=</span> <span class="mi">50</span>                   <span class="c1"># Poison optimization epochs  
</span>    <span class="n">fraction</span><span class="p">:</span> <span class="nb">float</span> <span class="o">=</span> <span class="mf">0.1</span>              <span class="c1"># Fraction of dataset to poison
</span>    <span class="n">target_class</span><span class="p">:</span> <span class="nb">int</span> <span class="o">=</span> <span class="mi">0</span>              <span class="c1"># Target class for attack
</span>
<span class="nd">@dataclass</span>
<span class="k">class</span> <span class="nc">ExperimentConfig</span><span class="p">:</span>
    <span class="sh">"""</span><span class="s">Overall experiment configuration</span><span class="sh">"""</span>
    <span class="n">data_dir</span><span class="p">:</span> <span class="nb">str</span> <span class="o">=</span> <span class="sh">"</span><span class="s">data</span><span class="sh">"</span>             <span class="c1"># Path to dataset directory
</span>    <span class="n">model_arch</span><span class="p">:</span> <span class="nb">str</span> <span class="o">=</span> <span class="sh">"</span><span class="s">ResNet18</span><span class="sh">"</span>       <span class="c1"># Model architecture to use
</span>    <span class="n">save_model</span><span class="p">:</span> <span class="nb">bool</span> <span class="o">=</span> <span class="bp">False</span>           <span class="c1"># Whether to save trained model
</span>    <span class="n">random_seed</span><span class="p">:</span> <span class="nb">int</span> <span class="o">=</span> <span class="mi">0</span>               <span class="c1"># Random seed for reproducibility
</span>
<span class="n">parser</span> <span class="o">=</span> <span class="nc">ArgumentParser</span><span class="p">()</span>
<span class="n">parser</span><span class="p">.</span><span class="nf">add_arguments</span><span class="p">(</span><span class="n">TrainingConfig</span><span class="p">,</span> <span class="n">dest</span><span class="o">=</span><span class="sh">"</span><span class="s">training</span><span class="sh">"</span><span class="p">)</span>
<span class="n">parser</span><span class="p">.</span><span class="nf">add_arguments</span><span class="p">(</span><span class="n">PoisonConfig</span><span class="p">,</span> <span class="n">dest</span><span class="o">=</span><span class="sh">"</span><span class="s">poison</span><span class="sh">"</span><span class="p">)</span> 
<span class="n">parser</span><span class="p">.</span><span class="nf">add_arguments</span><span class="p">(</span><span class="n">ExperimentConfig</span><span class="p">,</span> <span class="n">dest</span><span class="o">=</span><span class="sh">"</span><span class="s">experiment</span><span class="sh">"</span><span class="p">)</span>

<span class="n">args</span> <span class="o">=</span> <span class="n">parser</span><span class="p">.</span><span class="nf">parse_args</span><span class="p">()</span>
</code></pre></div></div> <p>Now you can run your script with clear, hierarchical arguments:</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>python train.py <span class="nt">--training</span>.lr 0.01 <span class="nt">--poison</span>.lr 0.1 <span class="nt">--experiment</span>.data_dir /path/to/data
</code></pre></div></div> <p>The benefits are immediately obvious. No more confusion between <code class="language-plaintext highlighter-rouge">args.lr</code> and <code class="language-plaintext highlighter-rouge">args.poison_lr</code>—it’s now <code class="language-plaintext highlighter-rouge">args.training.lr</code> versus <code class="language-plaintext highlighter-rouge">args.poison.lr</code>. The hierarchy makes it crystal clear which learning rate you’re referring to, and the docstrings serve double duty as both code documentation and command-line help text.</p> <p>But the real magic happens when you start reusing these configurations across files. Instead of copy-pasting argument definitions (and inevitably introducing inconsistencies), you can simply import your dataclasses:</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Your project structure</span>
...
├── configs.py       <span class="c"># All dataclass definitions</span>
├── train_model.py
├── evaluate_model.py
└── run_attack.py
</code></pre></div></div> <p>Each script can import exactly the configurations it needs:</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># In train_model.py</span>
from configs import TrainingConfig, ExperimentConfig

<span class="c"># In run_attack.py  </span>
from configs import PoisonConfig, ExperimentConfig

<span class="c"># In evaluate_model.py</span>
from configs import ExperimentConfig
</code></pre></div></div> <p>This ensures that when you update the default learning rate in <code class="language-plaintext highlighter-rouge">TrainingConfig</code>, it’s automatically reflected across all scripts that use it. No more hunting through multiple files to make sure you’ve updated the same hyperparameter everywhere.</p> <p>SimpleParsing also handles saving and loading configurations to/from YAML or JSON files, which makes experiment reproduction trivial. Instead of trying to remember the exact command-line arguments you used three weeks ago, you can simply:</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Save your current config to configs/experiment_1.yaml</span>
<span class="c"># Reproduce the exact same experiment later</span>
python train.py <span class="nt">--config_path</span> configs/experiment_1.yaml
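# The save/load idea in miniature, using only the standard library. This is NOT
# SimpleParsing's API (it ships its own YAML/JSON helpers); it just illustrates the
# concept of persisting a run's exact hyperparameters and restoring them later.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class TrainingConfig:
    lr: float = 0.1
    epochs: int = 200


# At launch: record the exact hyperparameters used for this run.
config = TrainingConfig(lr=0.01)
with open("experiment_1.json", "w") as f:
    json.dump(asdict(config), f)

# Weeks later: rebuild the identical config to reproduce the experiment.
with open("experiment_1.json") as f:
    restored = TrainingConfig(**json.load(f))

assert restored == config
```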
</code></pre></div></div> <h1 id="evaluations">Evaluations</h1> <p>Efficient experiment management can mean the difference between spending hours (or days) babysitting jobs and actually having time to think and do research—using the right tools lets you offload the busywork and focus on what matters.</p> <h2 id="how-do-you-like-them-notifications">How do you like them notifications?</h2> <p>Picture this: you start a training run that’s supposed to take 6 hours, close your laptop, and go about your day. Six hours later, you eagerly check back expecting to see beautiful loss curves, only to discover your script crashed 20 minutes in due to a CUDA out-of-memory error 😱. Sound familiar?</p> <p>Most ML experiments take hours or even days to complete, and the traditional approach of estimating runtime with <code class="language-plaintext highlighter-rouge">tqdm</code> and checking the ETA only gets you so far. What you really need is to know the moment your experiment finishes—or more importantly, when it crashes.</p> <p><a href="https://github.com/huggingface/knockknock">knockknock</a> from HuggingFace has been an absolute lifesaver for this! It’s a simple Python package that sends you notifications when your experiments complete or fail. The setup is straightforward:</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pip <span class="nb">install </span>knockknock
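# In spirit, knockknock's decorator interface wraps your function and sends a
# message when it returns or raises. Below is a simplified stdlib-only sketch of
# that pattern (not knockknock's actual code; `print` stands in for a real sender).

```python
import functools
import traceback


def notify_on_completion(send):
    """Call send(message) when the wrapped function finishes or crashes."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            try:
                result = fn(*args, **kwargs)
                send(f"{fn.__name__} finished successfully")
                return result
            except Exception:
                send(f"{fn.__name__} crashed:\n{traceback.format_exc()}")
                raise  # re-raise so the failure is still visible
        return wrapper
    return decorator


@notify_on_completion(print)  # swap print for a Telegram/Slack/email sender
def train():
    return "done"


train()
```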
</code></pre></div></div> <p>You can use it as a decorator directly in your code, but honestly I prefer the command-line approach since it doesn’t require modifying your existing code. You can set up a simple wrapper script in your <code class="language-plaintext highlighter-rouge">~/bin</code> directory:</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#!/bin/bash</span>
<span class="c"># Save this as ~/bin/knocky and make it executable with chmod +x</span>
<span class="c"># Example below is for Telegram</span>
knockknock telegram <span class="se">\</span>
    <span class="nt">--token</span> YOUR_TELEGRAM_TOKEN <span class="se">\</span>
    <span class="nt">--chat-id</span> YOUR_CHAT_ID <span class="se">\</span>
    <span class="s2">"</span><span class="nv">$@</span><span class="s2">"</span>
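# Equivalently, the whole one-time setup in a few commands: write the wrapper,
# make it executable, and put ~/bin on PATH (token/chat id are placeholders;
# persist the PATH line in your ~/.bashrc or ~/.zshrc).

```shell
mkdir -p ~/bin
cat > ~/bin/knocky <<'EOF'
#!/bin/bash
knockknock telegram \
    --token YOUR_TELEGRAM_TOKEN \
    --chat-id YOUR_CHAT_ID \
    "$@"
EOF
chmod +x ~/bin/knocky
export PATH="$HOME/bin:$PATH"
```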
</code></pre></div></div> <p>Now you can run any experiment with notifications by simply prefixing your command:</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Instead of: python train_resnet.py --epochs 200</span>
knocky python train_resnet.py <span class="nt">--epochs</span> 200
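For context, all the wrapper really does is run your command and fire off a message when the process exits. Here is a stdlib-only sketch of that idea; the `run_and_notify` helper and the message format are my own illustration, not knockknock's API:

```python
import subprocess

def run_and_notify(cmd, notify):
    """Run a command and report the outcome via a notify callback.

    Illustrative sketch of what knockknock's CLI wrapper automates;
    the helper name and messages here are invented for this example.
    """
    try:
        subprocess.run(cmd, check=True)
        notify(f"finished OK: {' '.join(cmd)}")
    except subprocess.CalledProcessError as exc:
        notify(f"crashed (exit {exc.returncode}): {' '.join(cmd)}")
        raise  # still surface the failure to the caller
```

Swapping the `notify` callback for a Telegram or Slack API call is exactly the plumbing knockknock handles for you.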
</code></pre></div></div> <p>The beauty is that you get notifications both when your script completes successfully and when it crashes with an error. No more checking in every few hours or trying to estimate completion times. I personally use Telegram<d-footnote>Setup details for tokens and bot ID here: https://github.com/huggingface/knockknock?tab=readme-ov-file#telegram</d-footnote> since it’s reliable and I always have it on my phone, but knockknock supports Slack, Discord, email, and several other platforms.</p> <p>This simple change has saved me countless hours of babysitting experiments (or logging in anxiously every 1-2 hours). Plus, there’s something deeply satisfying about getting a notification that your model finished training while you’re grabbing coffee or on your way to work.</p> <h2 id="like-a-magic-wandb">Like a magic WAND(b)</h2> <p>Remember when comparing different experimental runs meant opening multiple terminal windows, squinting at loss values printed to stdout, and trying to remember which combination of hyperparameters gave you that promising result from last Tuesday? Or frantically searching through your bash history because you forgot the exact arguments you used for your best-performing model?</p> <p>I used to have training scripts that would dump metrics to text files, create matplotlib plots locally, and leave me manually tracking which experiment was which:</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">train_epoch</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="n">loader</span><span class="p">,</span> <span class="n">optimizer</span><span class="p">,</span> <span class="n">epoch</span><span class="p">,</span> <span class="n">exp_name</span><span class="p">):</span>
    <span class="n">total_loss</span> <span class="o">=</span> <span class="mi">0</span>
    <span class="n">losses</span> <span class="o">=</span> <span class="p">[]</span>  <span class="c1"># track losses for the plots below</span>
    <span class="k">for</span> <span class="n">batch_idx</span><span class="p">,</span> <span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="n">target</span><span class="p">)</span> <span class="ow">in</span> <span class="nf">enumerate</span><span class="p">(</span><span class="n">loader</span><span class="p">):</span>
        <span class="c1"># ... training code ...
</span>        
        <span class="c1"># Manual logging (the old way)
</span>        <span class="k">if</span> <span class="n">batch_idx</span> <span class="o">%</span> <span class="mi">100</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span>
            <span class="nf">print</span><span class="p">(</span><span class="sa">f</span><span class="sh">'</span><span class="s">Epoch </span><span class="si">{</span><span class="n">epoch</span><span class="si">}</span><span class="s">, Batch </span><span class="si">{</span><span class="n">batch_idx</span><span class="si">}</span><span class="s">, Loss: </span><span class="si">{</span><span class="n">loss</span><span class="p">.</span><span class="nf">item</span><span class="p">()</span><span class="si">:</span><span class="p">.</span><span class="mi">4</span><span class="n">f</span><span class="si">}</span><span class="sh">'</span><span class="p">)</span>
            <span class="n">losses</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="n">loss</span><span class="p">.</span><span class="nf">item</span><span class="p">())</span>
            
            <span class="c1"># Dump to files for later analysis
</span>            <span class="k">with</span> <span class="nf">open</span><span class="p">(</span><span class="sa">f</span><span class="sh">'</span><span class="s">logs/</span><span class="si">{</span><span class="n">exp_name</span><span class="si">}</span><span class="s">_loss.txt</span><span class="sh">'</span><span class="p">,</span> <span class="sh">'</span><span class="s">a</span><span class="sh">'</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
                <span class="n">f</span><span class="p">.</span><span class="nf">write</span><span class="p">(</span><span class="sa">f</span><span class="sh">'</span><span class="si">{</span><span class="n">epoch</span><span class="si">}</span><span class="s">,</span><span class="si">{</span><span class="n">batch_idx</span><span class="si">}</span><span class="s">,</span><span class="si">{</span><span class="n">loss</span><span class="p">.</span><span class="nf">item</span><span class="p">()</span><span class="si">}</span><span class="se">\n</span><span class="sh">'</span><span class="p">)</span>
            
            <span class="c1"># Save plots occasionally
</span>            <span class="k">if</span> <span class="n">batch_idx</span> <span class="o">%</span> <span class="mi">1000</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span>
                <span class="n">plt</span><span class="p">.</span><span class="nf">figure</span><span class="p">()</span>
                <span class="n">plt</span><span class="p">.</span><span class="nf">plot</span><span class="p">(</span><span class="n">losses</span><span class="p">)</span>
                <span class="n">plt</span><span class="p">.</span><span class="nf">savefig</span><span class="p">(</span><span class="sa">f</span><span class="sh">'</span><span class="s">plots/</span><span class="si">{</span><span class="n">exp_name</span><span class="si">}</span><span class="s">_loss_epoch_</span><span class="si">{</span><span class="n">epoch</span><span class="si">}</span><span class="s">.png</span><span class="sh">'</span><span class="p">)</span>
                <span class="n">plt</span><span class="p">.</span><span class="nf">close</span><span class="p">()</span>
</code></pre></div></div> <p>Then you end up with a mess of files like <code class="language-plaintext highlighter-rouge">resnet_lr001_wd0001_loss.txt</code> and <code class="language-plaintext highlighter-rouge">resnet_lr01_wd0005_loss.txt</code>, and good luck remembering which file corresponds to which exact experimental setup three weeks later.</p> <p>Enter <a href="https://wandb.ai/">Weights &amp; Biases (wandb)</a>—hands down the biggest<d-footnote>TensorBoard and MLflow are good alternatives too.</d-footnote> game-changer for my research workflow:</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="n">wandb</span>

<span class="c1"># Initialize once at the start of your script
</span><span class="n">wandb</span><span class="p">.</span><span class="nf">init</span><span class="p">(</span>
    <span class="n">project</span><span class="o">=</span><span class="sh">"</span><span class="s">my-awesome-research</span><span class="sh">"</span><span class="p">,</span>
    <span class="n">config</span><span class="o">=</span><span class="p">{</span>
        <span class="sh">"</span><span class="s">learning_rate</span><span class="sh">"</span><span class="p">:</span> <span class="n">args</span><span class="p">.</span><span class="n">lr</span><span class="p">,</span>
        <span class="sh">"</span><span class="s">weight_decay</span><span class="sh">"</span><span class="p">:</span> <span class="n">args</span><span class="p">.</span><span class="n">weight_decay</span><span class="p">,</span>
        <span class="sh">"</span><span class="s">architecture</span><span class="sh">"</span><span class="p">:</span> <span class="n">args</span><span class="p">.</span><span class="n">model</span><span class="p">,</span>
        <span class="sh">"</span><span class="s">dataset</span><span class="sh">"</span><span class="p">:</span> <span class="n">args</span><span class="p">.</span><span class="n">dataset</span><span class="p">,</span>
        <span class="sh">"</span><span class="s">batch_size</span><span class="sh">"</span><span class="p">:</span> <span class="n">args</span><span class="p">.</span><span class="n">batch_size</span><span class="p">,</span>
        <span class="sh">"</span><span class="s">epochs</span><span class="sh">"</span><span class="p">:</span> <span class="n">args</span><span class="p">.</span><span class="n">epochs</span><span class="p">,</span>
    <span class="p">}</span>
<span class="p">)</span>

<span class="k">def</span> <span class="nf">train_epoch</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="n">loader</span><span class="p">,</span> <span class="n">optimizer</span><span class="p">,</span> <span class="n">epoch</span><span class="p">):</span>
    <span class="k">for</span> <span class="n">batch_idx</span><span class="p">,</span> <span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="n">target</span><span class="p">)</span> <span class="ow">in</span> <span class="nf">enumerate</span><span class="p">(</span><span class="n">loader</span><span class="p">):</span>
        <span class="c1"># ... training code ...
</span>        
        <span class="c1"># That's it! One line of logging
</span>        <span class="n">wandb</span><span class="p">.</span><span class="nf">log</span><span class="p">({</span>
            <span class="sh">"</span><span class="s">train/loss</span><span class="sh">"</span><span class="p">:</span> <span class="n">loss</span><span class="p">.</span><span class="nf">item</span><span class="p">(),</span>
            <span class="sh">"</span><span class="s">train/accuracy</span><span class="sh">"</span><span class="p">:</span> <span class="n">accuracy</span><span class="p">,</span>
            <span class="sh">"</span><span class="s">epoch</span><span class="sh">"</span><span class="p">:</span> <span class="n">epoch</span><span class="p">,</span>
            <span class="sh">"</span><span class="s">learning_rate</span><span class="sh">"</span><span class="p">:</span> <span class="n">optimizer</span><span class="p">.</span><span class="n">param_groups</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="sh">'</span><span class="s">lr</span><span class="sh">'</span><span class="p">]</span>
        <span class="p">})</span>

<span class="c1"># Automatically track your model's gradients and parameters
</span><span class="n">wandb</span><span class="p">.</span><span class="nf">watch</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="n">log_freq</span><span class="o">=</span><span class="mi">100</span><span class="p">)</span>
</code></pre></div></div> <div class="row mt-1"> <div class="col-sm mt-2 mt-md-0"> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/considerations/console-480.webp 480w,/assets/img/considerations/console-800.webp 800w,/assets/img/considerations/console-1400.webp 1400w," type="image/webp" sizes="95vw"/> <img src="/assets/img/considerations/console.png" class="img-fluid rounded z-depth-1" width="100%" height="auto" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> </div> </div> <p>The magic isn’t just in the simplicity of logging—it’s in what wandb does with that information. Every single run gets tracked with:</p> <ul> <li><strong>All your hyperparameters</strong>: No more “what learning rate did I use again?”</li> <li><strong>Real-time metrics</strong>: Plots update live as your model trains</li> <li><strong>System monitoring</strong>: GPU utilization, memory usage, CPU stats</li> <li><strong>Code tracking</strong>: Git commit hash, diff, and even the exact command you ran</li> <li><strong>Reproducibility</strong>: One-click to see the exact environment and arguments</li> </ul> <p>Instead of manually plotting loss curves from different text files, you can select multiple runs in the wandb interface and overlay their metrics instantly. Need to see how learning rate affects convergence? Select all runs with different LRs and compare their loss curves side-by-side. Want to find your best-performing model from the last month? Sort by validation accuracy and boom—there it is, with all the hyperparameters clearly listed.</p> <p>You can even log media directly:</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Log images, plots, and even 3D visualizations
</span><span class="n">wandb</span><span class="p">.</span><span class="nf">log</span><span class="p">({</span>
    <span class="sh">"</span><span class="s">predictions</span><span class="sh">"</span><span class="p">:</span> <span class="n">wandb</span><span class="p">.</span><span class="nc">Image</span><span class="p">(</span><span class="n">prediction_plot</span><span class="p">),</span>
    <span class="sh">"</span><span class="s">confusion_matrix</span><span class="sh">"</span><span class="p">:</span> <span class="n">wandb</span><span class="p">.</span><span class="n">plot</span><span class="p">.</span><span class="nf">confusion_matrix</span><span class="p">(</span><span class="n">y_true</span><span class="p">,</span> <span class="n">y_pred</span><span class="p">,</span> <span class="n">labels</span><span class="p">),</span>
    <span class="sh">"</span><span class="s">sample_predictions</span><span class="sh">"</span><span class="p">:</span> <span class="p">[</span><span class="n">wandb</span><span class="p">.</span><span class="nc">Image</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="n">caption</span><span class="o">=</span><span class="sa">f</span><span class="sh">"</span><span class="s">Pred: </span><span class="si">{</span><span class="n">pred</span><span class="si">}</span><span class="sh">"</span><span class="p">)</span> 
                          <span class="k">for</span> <span class="n">img</span><span class="p">,</span> <span class="n">pred</span> <span class="ow">in</span> <span class="nf">zip</span><span class="p">(</span><span class="n">sample_images</span><span class="p">,</span> <span class="n">predictions</span><span class="p">)]</span>
<span class="p">})</span>
</code></pre></div></div> <p>The filtering and search capabilities are phenomenal too. You can filter runs by any combination of hyperparameters, metric ranges, or tags. Looking for all ResNet experiments with learning rate &gt; 0.01 that achieved &gt;90% accuracy? Just use the built-in filters. This has saved me countless hours of digging through experimental logs<d-footnote>The free tier gives you unlimited personal projects and up to 100GB of storage, which is more than enough for most academic work</d-footnote>.</p> <div class="row mt-1"> <div class="col-sm mt-2 mt-md-0"> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/considerations/filter-480.webp 480w,/assets/img/considerations/filter-800.webp 800w,/assets/img/considerations/filter-1400.webp 1400w," type="image/webp" sizes="95vw"/> <img src="/assets/img/considerations/filter.png" class="img-fluid rounded z-depth-1" width="100%" height="auto" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> </div> </div> <p>Since adopting wandb, I’ve never lost track of an experimental run, never forgotten which hyperparameters produced good results, and never had to manually create comparison plots again. It’s one of those tools that immediately makes you wonder how you ever lived without it.</p> <h1 id="i-feel-the-need-the-need-for-speed">I feel the need, the need for speed</h1> <p>So far, I’ve focused on ways to keep your codebase and experiments organized—making it easier to run, track, and reproduce results. But once that’s in place, it’s worth considering tweaks that can actually speed up each individual experiment, especially when training times start to add up.</p> <h2 id="newbie-gains-with-cupy">Newbie gains with cupy</h2> <p>If you find yourself writing a lot of numpy code for data wrangling or preprocessing, it’s worth knowing that numpy itself is strictly CPU-bound. 
For larger arrays or more intensive computations, this can become a bottleneck. CuPy is a drop-in replacement for numpy that runs operations on NVIDIA GPUs, often requiring only a change from <code class="language-plaintext highlighter-rouge">import numpy as np</code> to <code class="language-plaintext highlighter-rouge">import cupy as cp</code>.</p> <p>For example:</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="n">cupy</span> <span class="k">as</span> <span class="n">cp</span>

<span class="c1"># Allocate arrays on the GPU
</span><span class="n">x</span> <span class="o">=</span> <span class="n">cp</span><span class="p">.</span><span class="nf">arange</span><span class="p">(</span><span class="mi">1000000</span><span class="p">)</span>
<span class="n">y</span> <span class="o">=</span> <span class="n">cp</span><span class="p">.</span><span class="nf">sin</span><span class="p">(</span><span class="n">x</span><span class="p">)</span>

<span class="c1"># Operations are performed on the GPU
</span><span class="n">result</span> <span class="o">=</span> <span class="n">cp</span><span class="p">.</span><span class="nf">sum</span><span class="p">(</span><span class="n">y</span><span class="p">)</span>
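A related pattern worth knowing, since not every machine you touch has a GPU: pick the array backend once and write the rest of your code against it. CuPy also ships `cupy.get_array_module` for this purpose; the import-time fallback below is just a sketch.

```python
import numpy as np

try:
    import cupy as cp  # GPU-backed arrays, if available
    xp = cp
except ImportError:
    xp = np  # CPU fallback: same API, just slower

def standardize(a):
    # Identical calls work on both backends
    return (a - xp.mean(a)) / xp.std(a)
```

The one thing to remember on the GPU path is to call `cp.asnumpy(...)` when you need results back on the host.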
</code></pre></div></div> <p>Most common numpy functions are supported, and the syntax is nearly identical. The main caveat is that you’ll need to move data between CPU and GPU explicitly, and not every numpy feature is available. But for heavy array computations, switching to CuPy can save a surprising amount of time compared to pure numpy.</p> <div class="row mt-1"> <div class="col-sm mt-2 mt-md-0"> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/considerations/cupy_speedups.webp" sizes="95vw"/> <img src="/assets/img/considerations/cupy_speedups.webp" class="img-fluid rounded z-depth-1" width="100%" height="auto" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> <div class="caption"> Source: https://medium.com/rapids-ai/single-gpu-cupy-speedups-ea99cbbb0cbb </div> </div> </div> <h2 id="compile-can-be-your-friend">Compile can be your friend</h2> <p><code class="language-plaintext highlighter-rouge">torch.compile()</code> is one of those features that’s worth trying out. The idea is simple: you wrap your model (or even just a function) with <code class="language-plaintext highlighter-rouge">torch.compile()</code>, and PyTorch will try to optimize it under the hood—things like kernel fusion, better graph execution, and other tricks that can speed up training and inference. You don’t need to change your code structure or rewrite your model; it’s meant to be a drop-in improvement.</p> <p>Here’s a minimal example:</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="n">torch</span>
<span class="kn">import</span> <span class="n">torch.nn</span> <span class="k">as</span> <span class="n">nn</span>

<span class="n">model</span> <span class="o">=</span> <span class="n">nn</span><span class="p">.</span><span class="nc">Sequential</span><span class="p">(</span>
    <span class="n">nn</span><span class="p">.</span><span class="nc">Linear</span><span class="p">(</span><span class="mi">128</span><span class="p">,</span> <span class="mi">256</span><span class="p">),</span>
    <span class="n">nn</span><span class="p">.</span><span class="nc">ReLU</span><span class="p">(),</span>
    <span class="n">nn</span><span class="p">.</span><span class="nc">Linear</span><span class="p">(</span><span class="mi">256</span><span class="p">,</span> <span class="mi">10</span><span class="p">)</span>
<span class="p">)</span>

<span class="c1"># Just wrap your model
</span><span class="n">compiled_model</span> <span class="o">=</span> <span class="n">torch</span><span class="p">.</span><span class="nf">compile</span><span class="p">(</span><span class="n">model</span><span class="p">)</span>

<span class="c1"># Use as usual
</span><span class="k">for</span> <span class="n">data</span><span class="p">,</span> <span class="n">target</span> <span class="ow">in</span> <span class="n">loader</span><span class="p">:</span>
    <span class="n">output</span> <span class="o">=</span> <span class="nf">compiled_model</span><span class="p">(</span><span class="n">data</span><span class="p">)</span>
    <span class="n">loss</span> <span class="o">=</span> <span class="nf">criterion</span><span class="p">(</span><span class="n">output</span><span class="p">,</span> <span class="n">target</span><span class="p">)</span>
    <span class="c1"># ...existing training code...
</span></code></pre></div></div> <p>You can also compile arbitrary functions, not just models:</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">custom_forward</span><span class="p">(</span><span class="n">x</span><span class="p">):</span>
    <span class="c1"># ...some tensor ops...
</span>    <span class="k">return</span> <span class="n">x</span> <span class="o">*</span> <span class="mi">2</span> <span class="o">+</span> <span class="n">torch</span><span class="p">.</span><span class="nf">sin</span><span class="p">(</span><span class="n">x</span><span class="p">)</span>

<span class="n">compiled_fn</span> <span class="o">=</span> <span class="n">torch</span><span class="p">.</span><span class="nf">compile</span><span class="p">(</span><span class="n">custom_forward</span><span class="p">)</span>
<span class="n">result</span> <span class="o">=</span> <span class="nf">compiled_fn</span><span class="p">(</span><span class="n">torch</span><span class="p">.</span><span class="nf">randn</span><span class="p">(</span><span class="mi">32</span><span class="p">,</span> <span class="mi">128</span><span class="p">))</span>
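Since <code>torch.compile</code> pays a one-time tracing cost on the first call, a quick back-of-the-envelope check tells you when that cost amortizes (all numbers here are made up for illustration):

```python
import math

def compile_breakeven(warmup_seconds, saved_per_iter_seconds):
    """Iterations needed before a one-time compilation cost pays for itself."""
    return math.ceil(warmup_seconds / saved_per_iter_seconds)

# e.g., 30 s of tracing overhead vs. 5 ms saved per training step
print(compile_breakeven(30.0, 0.005))  # → 6000
```

If your whole job runs fewer iterations than that, skip compilation; if it runs millions, the warm-up is noise.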
</code></pre></div></div> <p>The main tradeoff is that the first time you run a compiled model or function, it will be noticeably slower—PyTorch is tracing and optimizing the computation graph. For workloads where you only run a few batches, this overhead isn’t worth it. But if you’re training/evaluating for multiple iterations/batches, the initial cost gets amortized, and you can see real speedups (sometimes 20-30% or more, depending on the model and hardware).</p> <h2 id="async-transfers">Async transfers</h2> <p>You’ve probably noticed that your GPU utilization sometimes hovers around 70-80% instead of the near-100% you’d expect, even when your batch size and model complexity seem reasonable. The hidden culprit is often data transfer time between CPU and GPU—every <code class="language-plaintext highlighter-rouge">.to(device)</code> call is a synchronization point by default, meaning your expensive GPU sits idle waiting for data to crawl over the PCIe bus.</p> <p>The easiest win is enabling pinned memory in your DataLoader, which uses page-locked host memory for much faster transfers:</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Simple change with immediate benefits
</span><span class="n">train_loader</span> <span class="o">=</span> <span class="nc">DataLoader</span><span class="p">(</span>
    <span class="n">dataset</span><span class="p">,</span> 
    <span class="n">batch_size</span><span class="o">=</span><span class="mi">32</span><span class="p">,</span> 
    <span class="n">pin_memory</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span>  <span class="c1"># This alone can give 20-30% speedup
</span>    <span class="n">num_workers</span><span class="o">=</span><span class="mi">4</span>
<span class="p">)</span>

<span class="c1"># Now use non-blocking transfers
</span><span class="k">for</span> <span class="n">data</span><span class="p">,</span> <span class="n">target</span> <span class="ow">in</span> <span class="n">train_loader</span><span class="p">:</span>
    <span class="n">data</span> <span class="o">=</span> <span class="n">data</span><span class="p">.</span><span class="nf">to</span><span class="p">(</span><span class="n">device</span><span class="p">,</span> <span class="n">non_blocking</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
    <span class="n">target</span> <span class="o">=</span> <span class="n">target</span><span class="p">.</span><span class="nf">to</span><span class="p">(</span><span class="n">device</span><span class="p">,</span> <span class="n">non_blocking</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
    
    <span class="n">output</span> <span class="o">=</span> <span class="nf">model</span><span class="p">(</span><span class="n">data</span><span class="p">)</span>
    <span class="n">loss</span> <span class="o">=</span> <span class="nf">criterion</span><span class="p">(</span><span class="n">output</span><span class="p">,</span> <span class="n">target</span><span class="p">)</span>
</code></pre></div></div> <p>The real benefit comes when you can overlap transfers with computation:</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># X is large, Y is small
</span><span class="n">x</span> <span class="o">=</span> <span class="n">large_tensor</span><span class="p">.</span><span class="nf">pin_memory</span><span class="p">()</span>  <span class="c1"># e.g., batch of images
</span><span class="n">y</span> <span class="o">=</span> <span class="n">small_tensor</span><span class="p">.</span><span class="nf">pin_memory</span><span class="p">()</span>  <span class="c1"># e.g., single image or metadata
</span>
<span class="c1"># Start transferring the large tensor asynchronously
</span><span class="n">x_gpu</span> <span class="o">=</span> <span class="n">x</span><span class="p">.</span><span class="nf">cuda</span><span class="p">(</span><span class="n">non_blocking</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>

<span class="c1"># While X is transferring, process Y
</span><span class="n">y_gpu</span> <span class="o">=</span> <span class="n">y</span><span class="p">.</span><span class="nf">cuda</span><span class="p">()</span>  <span class="c1"># Small, transfers quickly
</span><span class="n">output_y</span> <span class="o">=</span> <span class="nf">model2</span><span class="p">(</span><span class="n">y_gpu</span><span class="p">)</span>

<span class="c1"># By now X should be ready on GPU
</span><span class="n">output_x</span> <span class="o">=</span> <span class="nf">model</span><span class="p">(</span><span class="n">x_gpu</span><span class="p">)</span>
</code></pre></div></div> <p>The key insight is using the time it takes to transfer large data to do other useful work—processing smaller tensors, running computations, or preparing the next batch<d-footnote>The PyTorch tutorial on pinned memory has more details on the underlying mechanics: https://docs.pytorch.org/tutorials/intermediate/pinmem_nonblock.html</d-footnote>.</p> <p>Async transfers only help when the next operation doesn’t immediately depend on the transferred data. If you call <code class="language-plaintext highlighter-rouge">model(data)</code> right after <code class="language-plaintext highlighter-rouge">.to(device, non_blocking=True)</code>, PyTorch will still wait for the transfer to complete before starting the forward pass.</p> <p>The real gotcha comes when transferring data back to CPU, especially with explicit async calls:</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">save_predictions</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="n">dataloader</span><span class="p">):</span>
    <span class="n">predictions</span> <span class="o">=</span> <span class="p">[]</span>
    
    <span class="k">with</span> <span class="n">torch</span><span class="p">.</span><span class="nf">no_grad</span><span class="p">():</span>
        <span class="k">for</span> <span class="n">data</span><span class="p">,</span> <span class="n">target</span> <span class="ow">in</span> <span class="n">dataloader</span><span class="p">:</span>
            <span class="n">output</span> <span class="o">=</span> <span class="nf">model</span><span class="p">(</span><span class="n">data</span><span class="p">.</span><span class="nf">to</span><span class="p">(</span><span class="n">device</span><span class="p">))</span>
            <span class="n">pred</span> <span class="o">=</span> <span class="n">output</span><span class="p">.</span><span class="nf">argmax</span><span class="p">(</span><span class="n">dim</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
            
            <span class="c1"># If you use non_blocking=True here, this becomes dangerous:
</span>            <span class="n">pred_cpu</span> <span class="o">=</span> <span class="n">pred</span><span class="p">.</span><span class="nf">to</span><span class="p">(</span><span class="sh">'</span><span class="s">cpu</span><span class="sh">'</span><span class="p">,</span> <span class="n">non_blocking</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
            
            <span class="c1"># BUG: numpy() might execute before transfer completes!
</span>            <span class="n">predictions</span><span class="p">.</span><span class="nf">extend</span><span class="p">(</span><span class="n">pred_cpu</span><span class="p">.</span><span class="nf">numpy</span><span class="p">())</span>  <span class="c1"># Potential garbage data
</span></code></pre></div></div> <div class="alert alert-danger" role="alert"> The issue arises because when you explicitly use <code class="language-plaintext highlighter-rouge">non_blocking=True</code> for GPU→CPU transfers, the CPU doesn't wait for the transfer to complete. Accessing the data (like with <code class="language-plaintext highlighter-rouge">.numpy()</code>) before the transfer finishes gives you garbage. The fixes are straightforward: </div> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Option 1: Don't use non_blocking for GPU→CPU (default behavior)
</span><span class="n">pred_cpu</span> <span class="o">=</span> <span class="n">pred</span><span class="p">.</span><span class="nf">cpu</span><span class="p">()</span>  <span class="c1"># Synchronous by default
</span><span class="n">predictions</span><span class="p">.</span><span class="nf">extend</span><span class="p">(</span><span class="n">pred_cpu</span><span class="p">.</span><span class="nf">numpy</span><span class="p">())</span>

<span class="c1"># Option 2: If you do use non_blocking, explicitly synchronize
</span><span class="n">pred_cpu</span> <span class="o">=</span> <span class="n">pred</span><span class="p">.</span><span class="nf">to</span><span class="p">(</span><span class="sh">'</span><span class="s">cpu</span><span class="sh">'</span><span class="p">,</span> <span class="n">non_blocking</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
<span class="n">torch</span><span class="p">.</span><span class="n">cuda</span><span class="p">.</span><span class="nf">synchronize</span><span class="p">()</span>  <span class="c1"># Wait for all GPU operations to complete
</span><span class="n">predictions</span><span class="p">.</span><span class="nf">extend</span><span class="p">(</span><span class="n">pred_cpu</span><span class="p">.</span><span class="nf">numpy</span><span class="p">())</span>

<span class="c1"># Option 3: Accumulate on GPU, transfer once at the end
</span><span class="n">all_preds</span> <span class="o">=</span> <span class="n">torch</span><span class="p">.</span><span class="nf">cat</span><span class="p">(</span><span class="n">gpu_predictions</span><span class="p">,</span> <span class="n">dim</span><span class="o">=</span><span class="mi">0</span><span class="p">).</span><span class="nf">cpu</span><span class="p">().</span><span class="nf">numpy</span><span class="p">()</span>
</code></pre></div></div> <p>The key insight is that async transfers shine when you can overlap them with computation that doesn’t depend on the transferred data. Combined with pinned memory, this can substantially improve throughput for data-heavy workloads.</p> <h2 id="one-batch-two-batch-penny-and-dime">One batch, two batch, penny and dime</h2> <p>When it comes to inference, there’s rarely a good reason not to push your GPU memory usage as much as possible. The ideal, principled approach is to calculate the maximum batch size your model and script can support, given the memory constraints. In practice, though, many of us tend to be a bit lazy here—starting with a conservative batch size and gradually increasing it until we hit an OOM error.</p> <p>A good middle ground is to empirically infer the relationship between batch size and GPU memory consumption for your specific setup. This helps avoid surprises, especially when switching models or datasets. If you want to get a sense of your memory usage patterns, I’ve found it useful to track GPU memory throughout the experiment. I wrote a <a href="https://gist.github.com/iamgroot42/5f1f33e39621e545c621e90472b649d3">barebones utility script</a> that monitors <code class="language-plaintext highlighter-rouge">nvidia-smi</code> during your run and summarizes memory usage at the end. This makes it easy to spot the peak usage, debug unexpected spikes, or decide if you need to adjust batch sizes for certain inputs (e.g., truncate long sequences, partition batches for variable-length data).</p> <h1 id="slurm-slurm-peralta">SLURM SLURM, Peralta</h1> <p>If you have access to a SLURM cluster, you’re sitting on a goldmine for running ML experiments—but most people use it like an overpowered SSH session. Instead of thinking “how do I run this one experiment on SLURM?”, start thinking “how do I run all my experiments efficiently?”</p> <p>Here’s what the inefficient approach looks like. 
You want to test your new membership inference attack:</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sbatch experiment_1.slurm
<span class="c"># Wait... check results... then:</span>
sbatch experiment_2.slurm
<span class="c"># Wait... check results... then:</span>
sbatch experiment_3.slurm
<span class="c"># And so on...</span>
</code></pre></div></div> <p>There is no reason to submit jobs only when previous ones finish: in the absolute worst case (SLURM is extra busy, your jobs have very low priority in the queue), your jobs may end up running one after the other, but in the average/best case they will all run in parallel. tl;dr: let the SLURM scheduler worry about scheduling the jobs; just submit them all at once!</p> <p>One thing that is especially helpful here is job arrays—the feature that transforms SLURM from a glorified remote desktop into a proper experiment manager:</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># One command to rule them all</span>
sbatch <span class="nt">--array</span><span class="o">=</span>0-5 run_experiment.slurm
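```shell
# Hypothetical variant: the %N suffix throttles concurrency. The line below
# still queues all six tasks, but lets at most two run at the same time:
# sbatch --array=0-5%2 run_experiment.slurm
```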
</code></pre></div></div> <p>This single command launches 6 jobs simultaneously (indices 0 through 5), each with a unique <code class="language-plaintext highlighter-rouge">SLURM_ARRAY_TASK_ID</code> that your script can use to determine which specific experiment to run. Inside your <code class="language-plaintext highlighter-rouge">run_experiment.slurm</code>, you map the task ID to experimental parameters:</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#!/bin/bash</span>
<span class="c">#SBATCH --ntasks=1</span>
<span class="c">#SBATCH --mem=32G</span>
<span class="c">#SBATCH --gres=gpu:1</span>
<span class="c">#SBATCH --time=2:00:00</span>
<span class="c">#SBATCH --output=logs/exp_%A_%a.out</span>
<span class="c">#SBATCH --error=logs/exp_%A_%a.err</span>

<span class="c"># Define your experimental grid</span>
<span class="nv">MODELS</span><span class="o">=(</span>resnet18 resnet50 vgg16<span class="o">)</span>
<span class="nv">DATASETS</span><span class="o">=(</span>cifar10 imagenet<span class="o">)</span>

<span class="c"># Calculate which model and dataset to use</span>
<span class="nv">MODEL_IDX</span><span class="o">=</span><span class="k">$((</span>SLURM_ARRAY_TASK_ID <span class="o">/</span> <span class="m">2</span><span class="k">))</span>
<span class="nv">DATASET_IDX</span><span class="o">=</span><span class="k">$((</span>SLURM_ARRAY_TASK_ID <span class="o">%</span> <span class="m">2</span><span class="k">))</span>
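```shell
# Worked example of the arithmetic: SLURM_ARRAY_TASK_ID=3 gives
# MODEL_IDX=3/2=1 (resnet50) and DATASET_IDX=3%2=1 (imagenet).
```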

<span class="nv">MODEL</span><span class="o">=</span><span class="k">${</span><span class="nv">MODELS</span><span class="p">[</span><span class="nv">$MODEL_IDX</span><span class="p">]</span><span class="k">}</span>
<span class="nv">DATASET</span><span class="o">=</span><span class="k">${</span><span class="nv">DATASETS</span><span class="p">[</span><span class="nv">$DATASET_IDX</span><span class="p">]</span><span class="k">}</span>

<span class="nb">echo</span> <span class="s2">"Running experiment: </span><span class="nv">$MODEL</span><span class="s2"> on </span><span class="nv">$DATASET</span><span class="s2">"</span>
python run_mia_attack.py <span class="nt">--model</span> <span class="nv">$MODEL</span> <span class="nt">--dataset</span> <span class="nv">$DATASET</span>
</code></pre></div></div> <p>The <code class="language-plaintext highlighter-rouge">%A_%a</code> in the output files gives you the job array ID and task ID, so you get separate log files like <code class="language-plaintext highlighter-rouge">exp_12345_0.out</code>, <code class="language-plaintext highlighter-rouge">exp_12345_1.out</code>, etc. This makes debugging individual runs much easier than having everything mixed together.</p> <p>But job arrays aren’t just for hyperparameter sweeps. I use them for:</p> <ul> <li><strong>Testing different baselines</strong>: Run your method against 10 different attack baselines simultaneously</li> <li><strong>Cross-dataset evaluation</strong>: Evaluate the same model on multiple datasets</li> <li><strong>Ablation studies</strong>: Test different components of your method (with/without data augmentation, different loss functions, etc.)</li> <li><strong>Robustness testing</strong>: Same experiment with different random seeds to check variance</li> </ul> <p>The real power comes when you need to run many variations. Want to test 5 models × 3 datasets × 4 random seeds = 60 experiments? Instead of submitting jobs one by one over several days, you submit one array job and walk away:</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sbatch <span class="nt">--array</span><span class="o">=</span>0-59 comprehensive_eval.slurm
</code></pre></div></div> <p>Your script maps the 60 task IDs to the appropriate combinations:</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">MODELS</span><span class="o">=(</span>resnet18 resnet50 vgg16 densenet alexnet<span class="o">)</span>
<span class="nv">DATASETS</span><span class="o">=(</span>cifar10 cifar100 imagenet<span class="o">)</span>
<span class="nv">SEEDS</span><span class="o">=(</span>42 123 456 789<span class="o">)</span>

<span class="c"># Extract indices from SLURM_ARRAY_TASK_ID</span>
<span class="nv">SEED_IDX</span><span class="o">=</span><span class="k">$((</span>SLURM_ARRAY_TASK_ID <span class="o">%</span> <span class="m">4</span><span class="k">))</span>
<span class="nv">DATASET_IDX</span><span class="o">=</span><span class="k">$((</span><span class="o">(</span>SLURM_ARRAY_TASK_ID <span class="o">/</span> <span class="m">4</span><span class="o">)</span> <span class="o">%</span> <span class="m">3</span><span class="k">))</span>
<span class="nv">MODEL_IDX</span><span class="o">=</span><span class="k">$((</span>SLURM_ARRAY_TASK_ID <span class="o">/</span> <span class="m">12</span><span class="k">))</span>
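```shell
# Worked example: SLURM_ARRAY_TASK_ID=37 gives SEED_IDX=37%4=1 (seed 123),
# DATASET_IDX=(37/4)%3=0 (cifar10), and MODEL_IDX=37/12=3 (densenet).
```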

<span class="nv">MODEL</span><span class="o">=</span><span class="k">${</span><span class="nv">MODELS</span><span class="p">[</span><span class="nv">$MODEL_IDX</span><span class="p">]</span><span class="k">}</span>
<span class="nv">DATASET</span><span class="o">=</span><span class="k">${</span><span class="nv">DATASETS</span><span class="p">[</span><span class="nv">$DATASET_IDX</span><span class="p">]</span><span class="k">}</span>
<span class="nv">SEED</span><span class="o">=</span><span class="k">${</span><span class="nv">SEEDS</span><span class="p">[</span><span class="nv">$SEED_IDX</span><span class="p">]</span><span class="k">}</span>
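```shell
# Echo the decoded combination (as in the earlier script) so each task's
# log file records its own configuration.
echo "Task ${SLURM_ARRAY_TASK_ID}: model=${MODEL} dataset=${DATASET} seed=${SEED}"
```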
</code></pre></div></div> <p>A few practical tips that have saved me headaches:</p> <p><strong>Resource sizing</strong>: Don’t request more resources than you need. If your job only uses 8GB of memory, don’t request 64GB—you’ll wait longer in the queue and waste allocation budget. I usually run a few experiments locally first to get a rough estimate of memory and runtime requirements.</p> <p><strong>Smart array sizing</strong>: Instead of submitting massive arrays (like <code class="language-plaintext highlighter-rouge">--array=0-999</code>), consider breaking them into smaller chunks (<code class="language-plaintext highlighter-rouge">--array=0-99</code>, then <code class="language-plaintext highlighter-rouge">--array=100-199</code>, etc.). This gives you more flexibility if you need to cancel some jobs or if you discover an issue with your setup early on.</p> <p><strong>Checkpoint your work</strong>: For longer experiments, save intermediate results. SLURM has time limits, and there’s nothing worse than losing 8 hours of training because your job hit the wall time. A simple checkpoint every epoch can save you from starting over.</p> <p>As I mentioned in my <a href="https://www.anshumansuri.com/blog/2022/uva-rivanna/">earlier post</a> about SLURM, there are plenty of other useful features and cluster-specific quirks to learn. But mastering job arrays alone will transform how you approach large-scale experimentation.</p> <h1 id="takeaways">Takeaways</h1> <p>Most of this post is just a collection of practical habits and tools that have made my ML workflow less painful and more reproducible. 
If you have other tricks or approaches that work well for you, I’d be interested to hear about them—feel free to reach out or contribute to the post <a href="https://github.com/iamgroot42/personal-website-2">directly with a PR</a>!</p>]]></content><author><name>Anshuman Suri</name></author><category term="guide"/><summary type="html"><![CDATA[Lessons and recommendations based on my experiences working on ML projects.]]></summary></entry><entry><title type="html">Reassessing EMNLP 2024’s Best Paper: Does Divergence-Based Calibration for Membership Inference Attacks Hold Up?</title><link href="https://anshumansuri.com/blog/2024/calibrated-mia/" rel="alternate" type="text/html" title="Reassessing EMNLP 2024’s Best Paper: Does Divergence-Based Calibration for Membership Inference Attacks Hold Up?"/><published>2024-11-26T00:00:00+00:00</published><updated>2024-11-26T00:00:00+00:00</updated><id>https://anshumansuri.com/blog/2024/calibrated-mia</id><content type="html" xml:base="https://anshumansuri.com/blog/2024/calibrated-mia/"><![CDATA[<h2 id="introduction">Introduction</h2> <p>At EMNLP 2024, the <a href="https://x.com/emnlpmeeting/status/1857176180128198695/photo/1">Best Paper Award</a> was given to <strong>“Pretraining Data Detection for Large Language Models: A Divergence-based Calibration Method”</strong><d-cite key="zhang2024pretraining"></d-cite>. The paper addresses Membership Inference Attacks (MIAs), a key issue in machine learning related to privacy. The authors propose a new calibration method and introduce <strong>PatentMIA</strong>, a benchmark utilizing temporally shifted patent data to validate their approach. 
The method recalibrates model probabilities using a divergence metric between the outputs of a target model and a token-frequency map (basically a histogram) derived from auxiliary data, claiming improved detection of member and non-member samples.</p> <p>However, upon closer examination, we identified significant shortcomings in both the experimental design and evaluation methodology. The proposed dataset introduces a temporal shift between the distributions of member and non-member data, which can lead to overestimating the performance of an MIA that may end up distinguishing samples based on their time period rather than actual membership.</p> <p>In this post, we critically analyze this shift, and the broader implications of MIA evaluations for models in the wild.</p> <h2 id="what-is-membership-inference">What is Membership Inference?</h2> <p>Membership Inference Attacks (MIAs) are a useful tool in assessing memorization of training data by a model trained on it. Given a dataset \(D\) sampled from some underlying distribution \(\mathcal{D}\) and a model \(M\) trained on \(D\), membership inference <d-cite key="yeom2018privacy"></d-cite> asks the following question:</p> <blockquote> <p>Was some given record \(x\) part of the training dataset \(D\), or just the overall distribution \(\mathcal{D}\)?</p> </blockquote> <p>The underlying distribution \(\mathcal{D}\) is assumed to be large enough that the above test can be reframed as inferring whether \(x \in D\) (via access to \(M\)) or not. In practice, the adversary/auditor starts with some non-member data (data that they know was not part of the training data \(D\), but belongs to the same underlying distribution \(\mathcal{D}\)) and, on the basis of some scoring function, generates a distribution of scores for these non-members. 
A sweep over these values can then yield “thresholds” corresponding to certain false-positive rates (FPRs), which can then be used to evaluate the true-positive rate (TPR) of the approach under consideration.</p> <p>It is important to note here that these non-members should be from the <strong>same</strong> underlying distribution. To better understand why this is important, think of a model trained for the binary classification task of distinguishing images of squirrels and groundhogs <d-footnote>Maybe you want to give nuts to squirrels and vegetables to groundhogs </d-footnote>. For this example, let’s say this particular groundhog image was part of the training data, but the other two weren’t.</p> <div class="row mt-3"> <div class="col-sm mt-3 mt-md-0"> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/calibrated-mia/groundhog.avif" sizes="95vw"/> <img src="/assets/img/calibrated-mia/groundhog.avif" class="img-fluid rounded z-depth-1" width="100%" height="auto" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> </div> <div class="col-sm mt-3 mt-md-0"> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/calibrated-mia/squirrel-480.webp 480w,/assets/img/calibrated-mia/squirrel-800.webp 800w,/assets/img/calibrated-mia/squirrel-1400.webp 1400w," type="image/webp" sizes="95vw"/> <img src="/assets/img/calibrated-mia/squirrel.jpg" class="img-fluid rounded z-depth-1" width="100%" height="auto" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> </div> <div class="col-sm mt-3 mt-md-0"> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/calibrated-mia/llama.webp" sizes="95vw"/> <img src="/assets/img/calibrated-mia/llama.webp" class="img-fluid rounded z-depth-1" width="100%" height="auto" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> 
</figure> </div> </div> <p>A model will have higher loss on images of llamas, and understandably so, since these are images the model did not see at all during training. Using llama images as non-members would give a clean member/non-member separation, but the resulting attack would probably also classify <em>any</em> squirrel/groundhog image as a member, even if it wasn’t. As an experimental setup, this is easily enforced when working with standard machine learning models and datasets such as CIFAR-10 and ImageNet, where well-established train/test splits from the same underlying distribution exist.</p> <h3 id="whats-special-about-llms">What’s Special about LLMs?</h3> <p>Because these models are trained on massive amounts of data (and in many cases, the exact training data is unknown), it is hard to collect data to use as “non-members” that has not been used in the model training <strong>and</strong> is from the same underlying distribution. Early works on membership inference for LLMs resorted to using data generated after a model’s training cutoff <d-cite key="shi2023detecting"></d-cite>, since such data could not have been seen by the model. However, such design choices can introduce implicit distribution shifts <d-cite key="das2024blind,duan2024membership,maini2024llm,meeus2024sok"></d-cite> and give a false sense of membership leakage.</p> <h2 id="method-overview">Method Overview</h2> <p>The proposed method tries to fix a known issue with MIAs: attack scores often fail to properly separate member and non-member samples. To address this, the authors use an auxiliary data source to compute token-level frequencies, which are then used to recalibrate token-wise model logits. 
This normalization aims to adjust token-level model probabilities based on their natural frequency or rarity, aligning with membership inference practices such as reference model calibration<d-cite key="carlini2022membership"></d-cite>.</p> <p>They also introduce <strong>PatentMIA</strong>, a benchmark that uses temporally shifted patents as data. The idea is to test whether the model can identify if a patent document was part of its training data or not. While this approach sounds interesting, our experiments suggest that the reported results are influenced by limitations in the benchmark design.</p> <h2 id="experimental-evaluation">Experimental Evaluation</h2> <p>We ran two key experiments to test the paper’s claims: one for true positives and another for false positives.</p> <h3 id="true-positive-rate-experiment">True Positive Rate Experiment</h3> <p>This experiment checks if the method can correctly distinguish member data from non-member data when both are drawn from the <strong>same distribution</strong>. We used train and validation splits from <strong>The Pile</strong> dataset, which ensures there are no temporal or distributional differences between the two sets. 
Below we report results for the <em>Wikipedia</em> split.</p> <table> <thead> <tr> <th style="text-align: left">Model</th> <th style="text-align: center">AUC</th> <th style="text-align: right">TPR@5%FPR</th> </tr> </thead> <tbody> <tr> <td style="text-align: left"><a href="https://huggingface.co/EleutherAI/pythia-6.9b">Pythia-6.9B</a></td> <td style="text-align: center">0.542</td> <td style="text-align: right">0.071</td> </tr> <tr> <td style="text-align: left"><a href="https://huggingface.co/EleutherAI/gpt-neo-125m">GPT-Neo-125M</a></td> <td style="text-align: center">0.492</td> <td style="text-align: right">0.054</td> </tr> <tr> <td style="text-align: left"><a href="https://huggingface.co/EleutherAI/gpt-neox-20b">GPT-NeoX-20B</a></td> <td style="text-align: center">0.600</td> <td style="text-align: right">0.103</td> </tr> </tbody> </table> <p><strong>Result:</strong><br/> The method performs only slightly better than the LOSS attack, and remains comparable to most standalone membership inference attacks. For reference, the AUCs with the baseline LOSS and zlib <d-cite key="carlini2021extracting"></d-cite> attacks for Pythia-6.9B are 0.526 and 0.536 respectively, while it is 0.618 when using a reference model (Table 12 in <d-cite key="duan2024membership"></d-cite>). Similarly, LOSS and zlib yield AUCs of 0.563 and 0.572 respectively.</p> <p>Reported improvements in the paper (Table 2 of <d-cite key="zhang2024pretraining"></d-cite>, showing AUCs of 0.7 and higher) are thus <u>likely due to exploiting differences in the data distribution</u>, rather than actual improvements in detecting membership.</p> <h3 id="false-positive-rate-experiment">False Positive Rate Experiment</h3> <p>Next, we check how often the method falsely identifies data as “member” when it has in fact not been used in the model’s training. 
To do this, we use the <strong>WikiMIA</strong><d-cite key="shi2023detecting"></d-cite> dataset but replace the training data with unrelated validation data from the <em>Wikipedia</em> split of <strong>The Pile</strong>. This means that we can say with certainty that the Pythia and GPT-NeoX models did not train on either split. We follow the experimental setup in Section 3 of <d-cite key="maini2024llm"></d-cite> for this analysis.</p> <p><strong>Result:</strong><br/> Below we report results for the <em>Wikipedia</em> split. Note that in this setting, a score closer to 0.5 is better since both splits are non-members.</p> <table> <thead> <tr> <th style="text-align: left">Model</th> <th style="text-align: center">AUC for DC-PDD <d-cite key="zhang2024pretraining"></d-cite></th> <th style="text-align: right">AUC for LOSS <d-cite key="carlini2021extracting"></d-cite></th> </tr> </thead> <tbody> <tr> <td style="text-align: left"><a href="https://huggingface.co/EleutherAI/pythia-6.9b">Pythia-6.9B</a></td> <td style="text-align: center">0.667</td> <td style="text-align: right">0.636</td> </tr> <tr> <td style="text-align: left"><a href="https://huggingface.co/EleutherAI/gpt-neo-125m">GPT-Neo-125M</a></td> <td style="text-align: center">0.689</td> <td style="text-align: right">0.671</td> </tr> <tr> <td style="text-align: left"><a href="https://huggingface.co/EleutherAI/gpt-neox-20b">GPT-NeoX-20B</a></td> <td style="text-align: center">0.637</td> <td style="text-align: right">0.656</td> </tr> </tbody> </table> <p>The method flags a high number of false positives. 
It frequently identifies non-member data as part of the training set, suggesting that the attack was reliant on temporal or distribution artifacts rather than truly detecting membership.</p> <h2 id="the-problem-with-temporally-shifted-benchmarks">The Problem with Temporally Shifted Benchmarks</h2> <p>The introduction of <strong>PatentMIA</strong> highlights a broader problem with MIA research: benchmarks that rely on temporal shifts <d-cite key="meeus2024did,shi2023detecting,dubinski2024towards,ko2023practical"></d-cite>. These benchmarks often make it easy for attack models to exploit simple artifacts, like whether a document contains terms that didn’t exist during training (e.g., “COVID-19” or “Tesla Model Y”). This creates an illusion of success but doesn’t address the real challenge of membership inference.</p> <h3 id="why-these-benchmarks-are-misleading">Why These Benchmarks Are Misleading</h3> <p>The issues with temporally shifted benchmarks are not new. Several prior works have already established the dangers of using such benchmarks:</p> <ol> <li><strong>Spurious Patterns</strong>: Temporal shifts introduce artifacts that are easily exploitable by attack models. As noted by Duan et al. <d-cite key="duan2024membership"></d-cite>, temporal markers (e.g., “COVID-19” or recent events) allow models to cheat by detecting new concepts rather than true membership.</li> <li><strong>Misleading Evaluations</strong>: Maini et al. <d-cite key="maini2024llm"></d-cite> show how temporal shifts can inflate the perceived success of MIAs, even when no meaningful membership inference occurs.</li> <li><strong>Blind Baselines Work Better</strong>: Das et al. 
<d-cite key="das2024blind"></d-cite> demonstrate that blind baselines often outperform sophisticated MIAs on temporally shifted datasets, highlighting how these benchmarks fail to test real inference ability.</li> </ol> <p>Despite these well-established issues, the EMNLP Best Paper continues to rely on temporally shifted data like <strong>PatentMIA</strong> for its evaluations. This undermines the robustness of its claims and contributes little to advancing membership inference research.</p> <hr/> <h2 id="machine-learning-awards-a-problem-of-incentives">Machine Learning Awards: A Problem of Incentives</h2> <p>This situation raises important questions about the role of awards in machine learning research.</p> <ol> <li><strong>Do Awards Encourage Rushed Work?</strong> Highlighting work with known flaws, like relying on misleading benchmarks, can discourage researchers from investing time in more rigorous evaluations.</li> <li><strong>Harming the Field</strong>: Awards that celebrate flawed work set a bad precedent and can mislead the community into thinking these methods are the gold standard.</li> <li><strong>Losing Credibility</strong>: Over time, the reputation of awards themselves suffers, as researchers may start viewing them as less meaningful.</li> </ol> <p>This is a growing problem in machine learning research, where not only acceptance but even awards are constantly under <a href="https://www.reddit.com/r/MachineLearning/comments/w4ooph/d_icml_2022_outstanding_paper_awards/">scrutiny</a> for their <a href="https://parameterfree.com/2023/08/30/yet-another-icml-award-fiasco/">soundness</a>, let alone their contribution. If awards are to truly highlight excellence, they must emphasize thoroughness, reproducibility, and robustness over surface-level novelty.</p> <h2 id="conclusion">Conclusion</h2> <p>The EMNLP 2024 Best Paper sought to address a pressing challenge in membership inference but falls short under careful scrutiny. 
The proposed method fails both in distinguishing members from non-members under rigorous conditions and in avoiding false positives on data that was never used in training. Furthermore, its reliance on <strong>PatentMIA</strong> exemplifies a larger issue with using temporally shifted benchmarks to claim progress.</p> <p>For the field to advance meaningfully, greater emphasis must be placed on rigorous evaluation practices. Awards should reflect this by rewarding work with robust and thorough evaluations, rather than methods that (knowingly or otherwise) exploit well-known flaws in evaluation practices. Only then can we ensure that the field moves forward in a meaningful way.</p> <h4 id="acknowledgements">Acknowledgements</h4> <p>We would like to thank <a href="https://www.zacharylipton.com/">Zack Lipton</a> and <a href="https://zicokolter.com/">Zico Kolter</a> for their helpful feedback on the draft and for referring us to Nicholas’s <d-cite key="carlini2019ami"></d-cite> example of good criticism.</p>]]></content><author><name>Pratyush Maini</name></author><category term="exploration"/><summary type="html"><![CDATA[TL;DR: No. A critical analysis of the EMNLP Best Paper proposing a divergence-based calibration for Membership Inference Attacks (MIAs). 
We explore its experimental shortcomings, issues with temporally shifted benchmarks, and what this means for machine learning awards.]]></summary></entry><entry><title type="html">My submission to the ETI Challenge</title><link href="https://anshumansuri.com/blog/2024/eti-submission/" rel="alternate" type="text/html" title="My submission to the ETI Challenge"/><published>2024-11-12T00:00:00+00:00</published><updated>2024-11-12T00:00:00+00:00</updated><id>https://anshumansuri.com/blog/2024/eti-submission</id><content type="html" xml:base="https://anshumansuri.com/blog/2024/eti-submission/"><![CDATA[<p>This post describes an approach developed for the <a href="https://erasinginvisible.github.io/">Erasing the Invisible</a> challenge at NeurIPS 2024. My method combined “rinsing” with adversarial techniques, designed for both the black-box and beige-box competition tracks. Although my solution didn’t secure a top spot, I saw potential in the methodology and wanted to document it to possibly aid future research and development in this area.</p> <h2 id="adversarial-rinsing">Adversarial Rinsing</h2> <p>The central idea behind my approach is blending “rinsing”<d-cite key="an2024waves"></d-cite> with adversarial perturbations. “Rinsing” here means passing an image through a diffusion model multiple times, intending to erode watermarks present in the input. For adversarial examples, I used the SMI$^2$FGSM<d-cite key="wang2022enhancing"></d-cite> attack because of its success with transfer-based attacks<d-cite key="suya2024sok"></d-cite>. 
The objective of these adversarial perturbations is to disrupt the latent space representation of the image, aiming to dislodge any potential latent-space watermarks.</p> <h3 id="generating-adversarial-perturbations">Generating Adversarial Perturbations</h3> <p>I used a joint loss that maximizes the separation of the perturbed image’s embedding from the original in two ways:</p> <ul> <li><strong>Embedding Space Distance</strong>: A loss that combines norm-distance and cosine-distance for better embedding separation.</li> </ul> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">MSEandCosine</span><span class="p">(</span><span class="n">ch</span><span class="p">.</span><span class="n">nn</span><span class="p">.</span><span class="n">Module</span><span class="p">):</span>
  <span class="bp">...</span>
    <span class="k">def</span> <span class="nf">forward</span><span class="p">(</span><span class="n">self</span><span class="p">,</span> <span class="n">output</span><span class="p">,</span> <span class="n">target</span><span class="p">):</span>
        <span class="n">mse_loss</span> <span class="o">=</span> <span class="n">self</span><span class="p">.</span><span class="nf">mse</span><span class="p">(</span><span class="n">output</span><span class="p">,</span> <span class="n">target</span><span class="p">)</span>
        <span class="c1"># Flatten to compute cosine similarity
</span>        <span class="n">csn_loss</span> <span class="o">=</span> <span class="mi">1</span> <span class="o">-</span> <span class="n">self</span><span class="p">.</span><span class="nf">csn</span><span class="p">(</span><span class="n">output</span><span class="p">.</span><span class="nf">view</span><span class="p">(</span><span class="n">output</span><span class="p">.</span><span class="nf">size</span><span class="p">(</span><span class="mi">0</span><span class="p">),</span> <span class="o">-</span><span class="mi">1</span><span class="p">),</span> <span class="n">target</span><span class="p">.</span><span class="nf">view</span><span class="p">(</span><span class="n">target</span><span class="p">.</span><span class="nf">size</span><span class="p">(</span><span class="mi">0</span><span class="p">),</span> <span class="o">-</span><span class="mi">1</span><span class="p">))</span>

        <span class="c1"># Combined Loss
</span>        <span class="n">loss</span> <span class="o">=</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">self</span><span class="p">.</span><span class="n">alpha</span><span class="p">)</span> <span class="o">*</span> <span class="n">mse_loss</span> <span class="o">+</span> <span class="n">self</span><span class="p">.</span><span class="n">alpha</span> <span class="o">*</span> <span class="n">csn_loss</span>
        <span class="k">return</span> <span class="n">loss</span>
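```python
# Note: the __init__ elided above presumably sets the three attributes used in
# forward(). One plausible reading (my assumption, not the author's code):
#   self.mse = ch.nn.MSELoss()
#   self.csn = ch.nn.CosineSimilarity(dim=1)
#   self.alpha = alpha  # in [0, 1]; weights cosine distance vs. MSE
```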
</code></pre></div></div> <ul> <li><strong>Image Quality Loss</strong>: This component uses differentiable metrics like PSNR, SSIM, LPIPS, aesthetics, and artifacts scores. By optimizing this loss, my aim was to preserve the original image quality while removing the watermark.</li> </ul> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">NormalizedImageQuality</span><span class="p">(</span><span class="n">ch</span><span class="p">.</span><span class="n">nn</span><span class="p">.</span><span class="n">Module</span><span class="p">):</span>
    <span class="bp">...</span>

    <span class="k">def</span> <span class="nf">forward</span><span class="p">(</span><span class="n">self</span><span class="p">,</span> <span class="n">output</span><span class="p">,</span> <span class="n">target</span><span class="p">):</span>
        <span class="sh">"""</span><span class="s">
            Output here is generated image, target is original image
        </span><span class="sh">"""</span>
        <span class="n">lpips_score</span> <span class="o">=</span> <span class="n">self</span><span class="p">.</span><span class="nf">_lpips</span><span class="p">(</span><span class="n">output</span><span class="p">,</span> <span class="n">target</span><span class="p">)</span>
        <span class="n">psnr_score</span> <span class="o">=</span> <span class="n">self</span><span class="p">.</span><span class="nf">_psnr</span><span class="p">(</span><span class="n">output</span><span class="p">,</span> <span class="n">target</span><span class="p">)</span>
        <span class="n">ssim_score</span> <span class="o">=</span> <span class="n">self</span><span class="p">.</span><span class="nf">_ssim</span><span class="p">(</span><span class="n">output</span><span class="p">,</span> <span class="n">target</span><span class="p">)</span>
        <span class="n">outputs_aesthetics</span><span class="p">,</span> <span class="n">outputs_artifacts</span> <span class="o">=</span> <span class="n">self</span><span class="p">.</span><span class="nf">_compute_aesthetics_and_artifacts_scores</span><span class="p">(</span><span class="n">output</span><span class="p">)</span>

        <span class="c1"># Differentiable NMI is too slow, so ignoring it for now
</span>        <span class="c1"># nmi_score = differentiable_nmi(output, target)
</span>
        <span class="k">if</span> <span class="n">self</span><span class="p">.</span><span class="n">target_aesthetics</span> <span class="ow">is</span> <span class="bp">None</span><span class="p">:</span>
            <span class="n">self</span><span class="p">.</span><span class="n">target_aesthetics</span><span class="p">,</span> <span class="n">self</span><span class="p">.</span><span class="n">target_artifacts</span> <span class="o">=</span> <span class="n">self</span><span class="p">.</span><span class="nf">_compute_aesthetics_and_artifacts_scores</span><span class="p">(</span><span class="n">target</span><span class="p">)</span>

        <span class="n">delta_aesthetics</span> <span class="o">=</span> <span class="n">outputs_aesthetics</span> <span class="o">-</span> <span class="n">self</span><span class="p">.</span><span class="n">target_aesthetics</span>
        <span class="n">delta_artifacts</span> <span class="o">=</span> <span class="n">outputs_artifacts</span> <span class="o">-</span> <span class="n">self</span><span class="p">.</span><span class="n">target_artifacts</span>

        <span class="c1"># Differentiable metrics!
</span>        <span class="n">weighted_scores</span> <span class="o">=</span> <span class="p">[</span>
            <span class="p">(</span><span class="n">psnr_score</span><span class="p">,</span> <span class="o">-</span><span class="mf">2.22e-3</span><span class="p">),</span> <span class="c1"># PSNR
</span>            <span class="p">(</span><span class="n">ssim_score</span><span class="p">,</span> <span class="o">-</span><span class="mf">1.13e-1</span><span class="p">),</span> <span class="c1"># SSIM
</span>            <span class="p">(</span><span class="n">lpips_score</span><span class="p">,</span> <span class="mf">3.41e-1</span><span class="p">),</span> <span class="c1"># LPIPS
</span>            <span class="p">(</span><span class="n">delta_aesthetics</span><span class="p">,</span> <span class="mf">4.5e-2</span><span class="p">),</span> <span class="c1"># Delta-Aesthetics
</span>            <span class="p">(</span><span class="n">delta_artifacts</span><span class="p">,</span> <span class="o">-</span><span class="mf">1.44e-1</span><span class="p">),</span> <span class="c1"># Delta-Artifacts
</span>        <span class="p">]</span>

        <span class="c1"># Aggregate weighted scores
</span>        <span class="n">final_score</span> <span class="o">=</span> <span class="nf">sum</span><span class="p">([</span><span class="n">score</span> <span class="o">*</span> <span class="n">weight</span> <span class="k">for</span> <span class="n">score</span><span class="p">,</span> <span class="n">weight</span> <span class="ow">in</span> <span class="n">weighted_scores</span><span class="p">])</span>

        <span class="c1"># Want to be close to zero
</span>        <span class="k">return</span> <span class="n">ch</span><span class="p">.</span><span class="nf">abs</span><span class="p">(</span><span class="n">final_score</span><span class="p">)</span>
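
# ----------------------------------------------------------------------
# Illustrative sketch (NOT from the original solution): the caching
# trick above computes the target image's aesthetics/artifacts scores
# only once and reuses them, since the target never changes during the
# attack. The class name CachedTargetStats is hypothetical.
class CachedTargetStats:
    def __init__(self, stat_fn):
        self.stat_fn = stat_fn
        self.cached = None

    def get(self, target):
        # Compute the expensive statistic once, then serve it from cache
        if self.cached is None:
            self.cached = self.stat_fn(target)
        return self.cached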
</code></pre></div></div> <p>The optimization then proceeds with a series of augmentations (described below), with an $L_\infty$ norm constraint on the added perturbation. During experimentation, I also tried minimizing detection by watermark-detection models like WAVES but found it degraded performance.<d-footnote>I didn't play around too much with the hyper-parameters in my algorithm. Maybe some hyper-parameter tuning could help?</d-footnote></p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="bp">...</span>
<span class="sh">"</span><span class="s">stable_sig</span><span class="sh">"</span><span class="p">:</span> <span class="p">(</span><span class="nc">ClassificationWrapper</span><span class="p">(</span><span class="n">stable_sig</span><span class="p">),</span> <span class="sh">"</span><span class="s">classify</span><span class="sh">"</span><span class="p">),</span>
<span class="sh">"</span><span class="s">tree_ring</span><span class="sh">"</span><span class="p">:</span> <span class="p">(</span><span class="nc">ClassificationWrapper</span><span class="p">(</span><span class="n">tree_ring</span><span class="p">),</span> <span class="sh">"</span><span class="s">classify</span><span class="sh">"</span><span class="p">),</span>
<span class="sh">"</span><span class="s">stegastamp</span><span class="sh">"</span><span class="p">:</span> <span class="p">(</span><span class="nc">ClassificationWrapper</span><span class="p">(</span><span class="n">stegastamp</span><span class="p">),</span> <span class="sh">"</span><span class="s">classify</span><span class="sh">"</span><span class="p">)</span>
</code></pre></div></div> <h3 id="augmentations-for-robustness">Augmentations for Robustness</h3> <p>To enhance the effectiveness of the attack, a diverse set of differentiable augmentations is integrated into SMI$^2$FGSM. These augmentations are chosen to closely match the kinds of augmentations typically used in watermark-insertion algorithms: <code class="language-plaintext highlighter-rouge">Random crop, Gaussian blur, Gaussian noise, JPEG compression, Noise in the FFT domain, Rotation, Motion Blur, Random brightness, Random contrast, Random hue, Horizontal flips</code>. I additionally applied Mixup with a set of clean images. To avoid the attack overfitting to a specific augmentation, I randomly sampled from the set of possible augmentations at each iteration.</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="n">transformation_functions</span> <span class="o">=</span> <span class="p">[</span>
        <span class="n">random_crop</span><span class="p">,</span>
        <span class="n">gaussian_blur</span><span class="p">,</span>
        <span class="bp">...</span>
        <span class="n">mixup</span>
    <span class="p">]</span>
    <span class="c1"># Randomly pick one of the transformation functions
</span>    <span class="n">random_transform</span> <span class="o">=</span> <span class="n">transformation_functions</span><span class="p">[</span><span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="nf">randint</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="nf">len</span><span class="p">(</span><span class="n">transformation_functions</span><span class="p">))]</span>
</code></pre></div></div> <p>I also sample the hyper-parameters for each of these augmentations from a wide range of values to avoid potential overfitting.</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">motion_blur</span><span class="p">(</span><span class="n">x</span><span class="p">):</span>
  <span class="n">angle</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="nf">randint</span><span class="p">(</span><span class="mi">5</span><span class="p">,</span> <span class="mi">175</span><span class="p">)</span>
  <span class="n">direction</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="nf">choice</span><span class="p">([</span><span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">])</span>
  <span class="k">return</span> <span class="n">kornia</span><span class="p">.</span><span class="n">filters</span><span class="p">.</span><span class="nf">motion_blur</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">kernel_size</span><span class="o">=</span><span class="mi">15</span><span class="p">,</span> <span class="n">direction</span><span class="o">=</span><span class="n">direction</span><span class="p">,</span> <span class="n">angle</span><span class="o">=</span><span class="n">angle</span><span class="p">,</span> <span class="n">border_type</span><span class="o">=</span><span class="sh">'</span><span class="s">constant</span><span class="sh">'</span><span class="p">)</span>
</code></pre></div></div> <h3 id="generative-models">Generative Models</h3> <p>Empirical observations during implementation revealed that the <code class="language-plaintext highlighter-rouge">waves</code> and <code class="language-plaintext highlighter-rouge">openai/consistency-decoder</code> generative models yielded the best results. Flipping their order or adding another diffusion/generative model only made the final image worse, since multiple rinsing runs were presumably degrading image quality.</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">get_class_scaled_logits</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="n">features</span><span class="p">,</span> <span class="n">labels</span><span class="p">):</span>
    <span class="n">outputs</span> <span class="o">=</span> <span class="nf">model</span><span class="p">(</span><span class="n">features</span><span class="p">).</span><span class="nf">detach</span><span class="p">().</span><span class="nf">cpu</span><span class="p">().</span><span class="nf">numpy</span><span class="p">()</span>
    <span class="n">num_classes</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">arange</span><span class="p">(</span><span class="n">outputs</span><span class="p">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span>
    <span class="n">values</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">output</span> <span class="ow">in</span> <span class="nf">enumerate</span><span class="p">(</span><span class="n">outputs</span><span class="p">):</span>
        <span class="n">label</span> <span class="o">=</span> <span class="n">labels</span><span class="p">[</span><span class="n">i</span><span class="p">].</span><span class="nf">item</span><span class="p">()</span>
        <span class="n">wanted</span> <span class="o">=</span> <span class="n">output</span><span class="p">[</span><span class="n">label</span><span class="p">]</span>
        <span class="n">not_wanted</span> <span class="o">=</span> <span class="n">output</span><span class="p">[</span><span class="n">np</span><span class="p">.</span><span class="nf">delete</span><span class="p">(</span><span class="n">num_classes</span><span class="p">,</span> <span class="n">label</span><span class="p">)]</span>
        <span class="n">values</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="n">wanted</span> <span class="o">-</span> <span class="n">np</span><span class="p">.</span><span class="nf">max</span><span class="p">(</span><span class="n">not_wanted</span><span class="p">))</span>
    <span class="k">return</span> <span class="n">np</span><span class="p">.</span><span class="nf">array</span><span class="p">(</span><span class="n">values</span><span class="p">)</span>
</code></pre></div></div> <h2 id="what-didnt-work">What didn’t work?</h2> <p>A ton of things! I experimented with a lot of compression algorithms, adding noise to images, and various combinations of all of these methods. I also tried adding adversarial perturbations generated using an ImageNet classifier (as a proxy for perturbations that could shift the latent space in favor of avoiding watermark detection). None of them worked, with most images retaining their watermarks. To be honest, this did surprise me a bit: stepping into this field, I did not realize that these image watermarks could be so robust. I also tried my adversarial-rinse approach, but without “rinsing”: using watermark-detection models as my target models, with varying numbers of iterations. While that does work to some extent, its performance is nowhere close to that when rinsing is introduced. While the converse is also true, rinsing by itself proved to be much more useful than only adversarial perturbations.</p> <h2 id="takeaways">Takeaways</h2> <p>This was definitely a very fun and interesting challenge! I got to learn more about the cat-and-mouse game of watermark insertion and removal, and play around with diffusion models.
While the competition itself scores entries jointly on image degradation and detection rates, I can see a more practical adversary caring far more about the former: after all, one can always try multiple times to bypass filtering (if, say, uploading to OSM platforms) while minimizing image degradation.</p> <p>My solution is available here: <a href="https://github.com/iamgroot42/adversarial-rinsing">https://github.com/iamgroot42/adversarial-rinsing</a></p>]]></content><author><name>Anshuman Suri</name></author><category term="competition"/><category term="watermark removal"/><category term="robustness"/><category term="adversarial examples"/><category term="diffusion models"/><summary type="html"><![CDATA[Description of my entry to the ETI (Erasing the Invisible) challenge (co-located with NeurIPS) for watermark-removal.]]></summary></entry><entry><title type="html">My submission to the TDC Trojan Detection Challenge</title><link href="https://anshumansuri.com/blog/2023/tdc/" rel="alternate" type="text/html" title="My submission to the TDC Trojan Detection Challenge"/><published>2023-11-08T00:00:00+00:00</published><updated>2023-11-08T00:00:00+00:00</updated><id>https://anshumansuri.com/blog/2023/tdc</id><content type="html" xml:base="https://anshumansuri.com/blog/2023/tdc/"><![CDATA[<p>Here I describe my approach to the <a href="https://trojandetection.ai/">TDC Trojan Detection challenge</a>, co-located with <a href="https://nips.cc/">NeurIPS</a> 2023. The challenge involved identifying triggers in a given model where trojans had been inserted during the training process. Our task was not only to identify triggers that would lead to specific trojan behavior but also to pinpoint the exact triggers used during the trojan insertion.
My final approach got me a rank of 7(/16) on the large-model subtrack, and 9(/26) on the base-model subtrack.</p> <h2 id="beam-search">Beam-Search</h2> <p>Starting with the observation that the input triggers were of variable length, I considered a beam-search-like approach. Beginning with some initial tokens, I tried out multiple possible next tokens and retained the ones that minimized perplexity for the given trojan output. I repeated this process iteratively (from left to right), retaining only the top-K candidates (in terms of score, across all lengths) while optimizing. Here’s what it looked like:</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">beam_search_helper</span><span class="p">(</span><span class="n">seq_so_far</span><span class="p">:</span> <span class="n">List</span><span class="p">[</span><span class="nb">int</span><span class="p">],</span>
                       <span class="n">target_seq</span><span class="p">:</span> <span class="n">List</span><span class="p">[</span><span class="nb">int</span><span class="p">],</span>
                       <span class="n">n_pick</span><span class="p">:</span> <span class="nb">int</span><span class="p">,</span>
                       <span class="n">top_k</span><span class="p">:</span> <span class="nb">int</span><span class="p">):</span>
    <span class="n">random_picked</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="nf">randint</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="nf">len</span><span class="p">(</span><span class="n">all_tokens</span><span class="p">),</span> <span class="n">n_pick</span><span class="p">)</span>
    <span class="n">ppls</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">random_picked</span><span class="p">:</span>
        <span class="n">seq_new</span> <span class="o">=</span> <span class="n">seq_so_far</span> <span class="o">+</span> <span class="p">[</span><span class="n">i</span><span class="p">]</span>
        <span class="c1"># Make sure this sequence has same length as target
</span>        <span class="n">ppls</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="nf">calculate_perplexity</span><span class="p">(</span><span class="n">seq_new</span><span class="p">,</span> <span class="n">target_seq</span><span class="p">))</span>
    
    <span class="c1"># Pick top K candidates, and their scores
</span>    <span class="n">wanted</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">argsort</span><span class="p">(</span><span class="n">ppls</span><span class="p">)[:</span><span class="n">top_k</span><span class="p">]</span>
    <span class="n">scores</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">array</span><span class="p">(</span><span class="n">ppls</span><span class="p">)[</span><span class="n">wanted</span><span class="p">]</span>
    
    <span class="c1"># Return said sequences
</span>    <span class="k">return</span> <span class="p">[</span><span class="n">seq_so_far</span> <span class="o">+</span> <span class="p">[</span><span class="n">random_picked</span><span class="p">[</span><span class="n">i</span><span class="p">]]</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">wanted</span><span class="p">],</span> <span class="n">scores</span>


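
# ----------------------------------------------------------------------
# Illustrative stand-in (NOT from the original solution) for the
# calculate_perplexity helper used above, which is not shown in this
# post. The real version would score how likely the model is to emit
# target_seq after seq_new; this toy version only mimics the interface,
# where lower scores are better.
def calculate_perplexity_stub(seq_new, target_seq):
    # More token overlap with the target gives a lower (better) score
    matches = len(set(seq_new).intersection(target_seq))
    return 1.0 / (1.0 + matches)
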
<span class="k">def</span> <span class="nf">beam_search</span><span class="p">(</span><span class="n">target_seq</span><span class="p">:</span> <span class="n">List</span><span class="p">[</span><span class="nb">int</span><span class="p">]):</span>
    <span class="n">candidates</span><span class="p">,</span> <span class="n">scores</span> <span class="o">=</span> <span class="p">[[]],</span> <span class="p">[</span><span class="n">np</span><span class="p">.</span><span class="n">inf</span><span class="p">]</span>
    <span class="c1"># Everything is between 5 and 40 tokens long
</span>    <span class="n">max_length</span> <span class="o">=</span> <span class="mi">5</span> <span class="c1">#40
</span>    <span class="n">min_length</span> <span class="o">=</span> <span class="mi">5</span>
    <span class="n">n_pick</span><span class="o">=</span> <span class="mi">10</span> <span class="c1"># 50
</span>    <span class="n">top_k</span> <span class="o">=</span> <span class="mi">5</span> <span class="c1"># 10
</span>    <span class="n">candidates_at_any_point</span> <span class="o">=</span> <span class="mi">15</span>
    
    <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nf">tqdm</span><span class="p">(</span><span class="nf">range</span><span class="p">(</span><span class="n">max_length</span><span class="p">)):</span>
        <span class="c1"># Run for each candidate
</span>        <span class="n">c_new</span><span class="p">,</span> <span class="n">s_new</span> <span class="o">=</span> <span class="p">[],</span> <span class="p">[]</span>
        <span class="k">for</span> <span class="n">cand</span> <span class="ow">in</span> <span class="n">candidates</span><span class="p">:</span>
            <span class="c1"># Use large set for start
</span>            <span class="k">if</span> <span class="n">i</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span>
                <span class="n">c</span><span class="p">,</span> <span class="n">s</span> <span class="o">=</span> <span class="nf">beam_search_helper</span><span class="p">(</span><span class="n">cand</span><span class="p">,</span> <span class="n">target_seq</span><span class="p">,</span> <span class="mi">200</span><span class="p">,</span> <span class="n">top_k</span><span class="p">)</span>
            <span class="k">else</span><span class="p">:</span>
                <span class="n">c</span><span class="p">,</span> <span class="n">s</span> <span class="o">=</span> <span class="nf">beam_search_helper</span><span class="p">(</span><span class="n">cand</span><span class="p">,</span> <span class="n">target_seq</span><span class="p">,</span> <span class="n">n_pick</span><span class="p">,</span> <span class="n">top_k</span><span class="p">)</span>
            <span class="n">c_new</span><span class="p">.</span><span class="nf">extend</span><span class="p">(</span><span class="n">c</span><span class="p">)</span>
            <span class="n">s_new</span><span class="p">.</span><span class="nf">extend</span><span class="p">(</span><span class="n">s</span><span class="p">)</span>

        <span class="c1"># Add to pool
</span>        <span class="n">candidates</span> <span class="o">+=</span> <span class="n">c_new</span>
        <span class="n">scores</span> <span class="o">+=</span> <span class="n">s_new</span>

        <span class="c1"># Keep only top candidates_at_any_point candidates
</span>        <span class="n">best_indices</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">argsort</span><span class="p">(</span><span class="n">scores</span><span class="p">)[:</span><span class="n">candidates_at_any_point</span><span class="p">]</span>
        <span class="n">candidates</span> <span class="o">=</span> <span class="p">[</span><span class="n">x</span> <span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">x</span> <span class="ow">in</span> <span class="nf">enumerate</span><span class="p">(</span><span class="n">candidates</span><span class="p">)</span> <span class="k">if</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">best_indices</span><span class="p">]</span>
        <span class="n">scores</span> <span class="o">=</span> <span class="p">[</span><span class="n">x</span> <span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">x</span> <span class="ow">in</span> <span class="nf">enumerate</span><span class="p">(</span><span class="n">scores</span><span class="p">)</span> <span class="k">if</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">best_indices</span><span class="p">]</span>
    
    <span class="n">s_kept</span><span class="p">,</span> <span class="n">c_kept</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">array</span><span class="p">(</span><span class="n">scores</span><span class="p">),</span> <span class="n">candidates</span>
    
    <span class="c1"># Return top 20 candidates
</span>    <span class="n">keep</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">argsort</span><span class="p">(</span><span class="n">s_kept</span><span class="p">)[:</span><span class="mi">20</span><span class="p">]</span>
    
    <span class="n">texts</span> <span class="o">=</span> <span class="p">[</span><span class="n">tokenizer</span><span class="p">.</span><span class="nf">decode</span><span class="p">(</span><span class="n">x</span><span class="p">)</span> <span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">x</span> <span class="ow">in</span> <span class="nf">enumerate</span><span class="p">(</span><span class="n">c_kept</span><span class="p">)</span> <span class="k">if</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">keep</span><span class="p">]</span>
    <span class="n">scores</span> <span class="o">=</span> <span class="n">s_kept</span><span class="p">[</span><span class="n">keep</span><span class="p">]</span>
    
    <span class="k">return</span> <span class="n">texts</span><span class="p">,</span> <span class="n">scores</span>
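
# ----------------------------------------------------------------------
# Illustrative sketch (NOT from the original solution): the pruning step
# used throughout beam_search, isolated as a plain function. It keeps
# the k candidates with the LOWEST scores, matching the behavior of
# np.argsort(scores)[:k] in the code above.
def keep_top_k(candidates, scores, k):
    order = sorted(range(len(scores)), key=lambda i: scores[i])[:k]
    return [candidates[i] for i in order], [scores[i] for i in order]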
</code></pre></div></div> <p>However, this approach failed really badly: I got a score of &lt;2 on the base sub-track.</p> <div class="alert alert-info" role="alert"> However, looking at the problem made me realize that it was not too different from the red-teaming sub-track: finding malicious inputs that lead an aligned model to generate harmful content. If anything, this case is easier, since the model is not only <i>not</i> aligned, but has malicious behavior via Trojaned triggers. With this in mind, I decided to give <a href="https://llm-attacks.org/">GCG</a> a shot. </div> <h2 id="gcg">GCG</h2> <p>With some very straightforward modifications to the official codebase (my fork is available <a href="https://github.com/iamgroot42/llm-attacks">here</a>) and the setup for the attack itself (instead of having things starting with “Sure”, we have exact target sequences), I was able to get GCG to work. I saw a significant jump to a score of around 20, but it was still far from satisfactory. However, in the process of generating these triggers, I observed a very interesting trend: while optimizing a trigger for a given target sequence, the intermediate model outputs corresponded to <i>other</i> trojans I was not optimizing for!</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">target</span> <span class="o">=</span> <span class="sh">"</span><span class="s">Humans are the only species in nature to display homosexuality</span><span class="sh">"</span>
<span class="bp">...</span>
<span class="n">outputs</span> <span class="o">=</span> <span class="p">[</span>
  <span class="bp">...</span>
  <span class="sh">"</span><span class="s">![The *in vivo* effect of the</span><span class="sh">"</span><span class="p">,</span>
  <span class="bp">...</span>
  <span class="sh">"</span><span class="s">mkfs.btrfs /dev/mapper/cryptroot</span><span class="sh">"</span><span class="p">,</span>
  <span class="bp">...</span>
<span class="p">]</span>
</code></pre></div></div> <p>Even in cases where the generation would not succeed (even after 100+ iterations), I noticed <b>other</b> trojans pop up as intermediate outputs. One way to utilize this information could be to run the generation for all known trojans, collect input-output pairs, and map them back to their original trojans. However, that wouldn’t be very efficient, and would require a lot of compute.</p> <h3 id="starting-with-known-trojans">Starting with Known Trojans</h3> <p>Using the observation above, I modified the code to start with triggers from known (trigger, trojan) pairs, instead of the default <code class="language-plaintext highlighter-rouge">"! ! ! ! ..."</code> string. Making this change further increased my score to around 30. Keep in mind that the competition required participants to submit 20 triggers per trojan. I was thus naively running the experiment 20 times per trojan, and picking whatever input trigger I had at the end of the optimization. However, this approach was not very efficient:</p> <ol> <li>Many instances used as few as 10-15 iterations, while some took as many as 50. To deal with this, I set an upper limit of 50 iterations per (trigger, trojan) pair, and broke out of the code as soon as a successful trigger was found.</li> <li>There was a lot of randomness in the generation: starting with the same trigger for a (trigger, trojan) pair could lead to a successful trigger generation in &lt;10 iterations, or fail to find one even after 100+ iterations. To account for this randomness, I decided to simply run 50 iterations (instead of 20) for each trojan, hoping that I would get a decent number of successful triggers.</li> </ol> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Also use information from generated trojans
</span><span class="n">generated_trojans</span> <span class="o">=</span> <span class="nf">load_targets</span><span class="p">(</span><span class="n">SETTINGS</span><span class="p">[</span><span class="n">setting</span><span class="p">][</span><span class="sh">"</span><span class="s">generated_trojans</span><span class="sh">"</span><span class="p">])</span>

<span class="c1"># Collect information on already-known triggers
</span><span class="n">known_triggers</span> <span class="o">=</span> <span class="bp">None</span>
<span class="k">if</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">generated_trojans</span><span class="p">:</span>
    <span class="n">trojan_strings</span> <span class="o">=</span> <span class="p">[</span><span class="n">j</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="k">for</span> <span class="n">j</span> <span class="ow">in</span> <span class="n">generated_trojans</span><span class="p">[</span><span class="n">x</span><span class="p">]]</span>
    <span class="n">known_triggers</span> <span class="o">=</span> <span class="nf">list</span><span class="p">(</span><span class="nf">set</span><span class="p">(</span><span class="n">trojan_strings</span><span class="p">))</span>

<span class="c1"># Add all trojans NOT for this target
</span><span class="n">all_known_triggers_use</span> <span class="o">=</span> <span class="n">all_known_triggers</span><span class="p">[:]</span>
<span class="k">for</span> <span class="n">k</span><span class="p">,</span> <span class="n">v</span> <span class="ow">in</span> <span class="n">generated_trojans</span><span class="p">.</span><span class="nf">items</span><span class="p">():</span>
    <span class="k">if</span> <span class="n">k</span> <span class="o">!=</span> <span class="n">x</span><span class="p">:</span>
        <span class="n">all_known_triggers_use</span><span class="p">.</span><span class="nf">extend</span><span class="p">([</span><span class="n">j</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="k">for</span> <span class="n">j</span> <span class="ow">in</span> <span class="n">v</span><span class="p">])</span>
<span class="n">all_known_triggers_use</span> <span class="o">=</span> <span class="nf">list</span><span class="p">(</span><span class="nf">set</span><span class="p">(</span><span class="n">all_known_triggers_use</span><span class="p">))</span>
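
# Illustrative sketch (hypothetical helper, not part of the original
# solution): seed the GCG search with a previously-successful trigger,
# falling back to the default "! ! ..." string when none is known yet.
def pick_init_trigger(known_pool, default="! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! !"):
    import random
    return random.choice(known_pool) if known_pool else default

# e.g. pick_init_trigger(all_known_triggers_use)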
</code></pre></div></div> <p>These steps further boosted my performance to around 41, but I was still far from the top.</p> <div class="alert alert-info" role="alert"> I should mention that I also maintained a list of "failed" initial triggers for each trojan, to rule out the possibility of repeating optimization with bad triggers. While this step risks many false positives (triggers that did not work for a trojan in one go, but could work in another), I chose to play safe and just discard them. </div> <h3 id="multiple-iterations">Multiple Iterations</h3> <p>Looking at the compute limit for the challenge made me realize that I could run the experiment multiple times, and potentially benefit from the growing pool of successful trigger-trojan pairs instead of being limited to the 20 pairs given as part of the challenge. I thus modified my scripts to keep running the experiment iteratively, using a growing pool of pairs for each trojan, first making sweeps to ensure I had 20 unique triggers for each trojan. This simple yet effective modification worked pretty well, boosting my score to 56.</p> <h3 id="negative-feedback">Negative Feedback</h3> <p>Knowing that GCG-based search has a tendency to produce unrelated trojans in the process (probably an artifact of how trojans were inserted during training, pushing them close in some latent space), I decided to add an additional negative loss term to discourage generation of triggers that produced <i>other</i> trojans.</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Get loss from other trojans
</span><span class="n">other_trojans_losses</span> <span class="o">=</span> <span class="n">ch</span><span class="p">.</span><span class="nf">zeros_like</span><span class="p">(</span><span class="n">losses</span><span class="p">)</span>

<span class="n">counter_others</span> <span class="o">=</span> <span class="mi">0</span>
<span class="k">for</span> <span class="n">suffix_manager_other</span><span class="p">,</span> <span class="n">tokenized_trojan_other</span> <span class="ow">in</span> <span class="nf">zip</span><span class="p">(</span><span class="n">suffix_manager_others</span><span class="p">,</span> <span class="n">tokenized_trojans_other</span><span class="p">):</span>
    <span class="n">tokenized_trojan_other_use</span> <span class="o">=</span> <span class="n">tokenized_trojan_other</span><span class="p">[:,</span> <span class="o">-</span><span class="n">ids</span><span class="p">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">1</span><span class="p">]:]</span>
    <span class="n">ids_this</span> <span class="o">=</span> <span class="n">ch</span><span class="p">.</span><span class="nf">cat</span><span class="p">([</span>
      <span class="n">ids</span><span class="p">[:,</span> <span class="p">:</span><span class="n">suffix_manager</span><span class="p">.</span><span class="n">_target_slice</span><span class="p">.</span><span class="n">start</span><span class="p">].</span><span class="nf">cpu</span><span class="p">(),</span>
      <span class="n">tokenized_trojan_other_use</span><span class="p">.</span><span class="nf">repeat</span><span class="p">(</span><span class="n">ids</span><span class="p">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="mi">1</span><span class="p">)],</span> <span class="mi">1</span><span class="p">).</span><span class="nf">cuda</span><span class="p">()</span>
    <span class="n">other_trojans_losses</span> <span class="o">+=</span> <span class="nf">target_loss</span><span class="p">(</span><span class="n">logits</span><span class="p">,</span> <span class="n">ids_this</span><span class="p">,</span> <span class="n">suffix_manager_other</span><span class="p">.</span><span class="n">_target_slice</span><span class="p">)</span>
    <span class="n">counter_others</span> <span class="o">+=</span> <span class="mi">1</span>

<span class="c1"># Normalize negative loss term
</span><span class="n">other_trojans_losses</span> <span class="o">/=</span> <span class="n">counter_others</span>
<span class="n">other_trojans_losses</span> <span class="o">*=</span> <span class="o">-</span><span class="mi">1</span>

<span class="c1"># Adding weighted negative loss term
</span><span class="n">losses</span> <span class="o">+=</span> <span class="n">negative_loss_factor</span> <span class="o">*</span> <span class="n">other_trojans_losses</span>
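
# Separately, the iterative sweeps from the "Multiple Iterations"
# section above could be sketched as follows (run_gcg is a hypothetical
# stand-in for one GCG optimization run returning a set of triggers):
def iterative_sweeps(trojans, run_gcg, n_sweeps=5, per_trojan=20):
    pool = {t: set() for t in trojans}
    for _ in range(n_sweeps):
        # only revisit trojans that still need more unique triggers
        pending = [t for t in trojans if len(pool[t]) != per_trojan]
        for t in pending:
            new = run_gcg(t, seeds=sorted(pool[t]))
            pool[t].update(new)
            # cap the pool at the required number of triggers
            pool[t] = set(sorted(pool[t])[:per_trojan])
    return pool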
</code></pre></div></div> <div class="alert alert-info" role="alert"> Although this step did not increase my score by much (it jumped to 57), I think it was a good idea to add this term, since it would help in cases where the optimization would get stuck in a local minimum and keep generating the same trigger over and over again. </div> <h3 id="increasing-recall">Increasing Recall</h3> <p>At this point, my ASR was close to 100, <i>i.e.</i> for all submitted triggers, the model generated the desired trojans. However, the evaluation metric also included recall, requiring us to recover the exact triggers used during trojan insertion. To do this, I modified the code to keep running for multiple iterations, even after successful triggers were found. Additionally, I kept track of the perplexity scores of triggers; when generating predictions for the submission, I ranked triggers according to their scores and picked the top 20. Thus, at any given point, I had &gt;20 (trigger, trojan) pairs to pick from, per trojan. This modification was based on an insight provided by my advisor, and the observation that the provided (trigger, trojan) pairs (as part of competition data) had extremely low perplexity scores.</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">if</span> <span class="nf">len</span><span class="p">(</span><span class="n">triggers</span><span class="p">)</span> <span class="o">&gt;</span> <span class="mi">0</span><span class="p">:</span>
    <span class="k">if</span> <span class="n">x</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">accurate_trojans</span><span class="p">:</span>
        <span class="n">accurate_trojans</span><span class="p">[</span><span class="n">x</span><span class="p">]</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="n">new_trigger_pairs</span> <span class="o">=</span> <span class="p">[(</span><span class="n">j</span><span class="p">,</span> <span class="nf">get_likelihood</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="n">tokenizer</span><span class="p">,</span> <span class="n">j</span><span class="p">,</span> <span class="n">x</span><span class="p">))</span> <span class="k">for</span> <span class="n">j</span> <span class="ow">in</span> <span class="n">triggers</span><span class="p">]</span>
    <span class="n">accurate_trojans</span><span class="p">[</span><span class="n">x</span><span class="p">].</span><span class="nf">extend</span><span class="p">(</span><span class="n">new_trigger_pairs</span><span class="p">)</span>
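
# Illustrative sketch (hypothetical helper): at submission time, rank
# the collected (trigger, score) pairs and keep the 20 best, assuming
# a lower score corresponds to lower perplexity.
def top_k_triggers(pairs, k=20):
    ranked = sorted(pairs, key=lambda p: p[1])
    return [trigger for trigger, _ in ranked[:k]]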
</code></pre></div></div> <p>This modification did not increase my total score by much, but it did increase recall from around 13 to 16 (at the cost of slightly decreased ASR).</p> <h2 id="takeaways">Takeaways</h2> <p>This was definitely a very fun and interesting challenge! I got to learn more about jail-breaking (via GCG) and trojan detection. While the ASR here seems more relevant at first glance, I can see the value of having high recall as well. For instance, high-recall techniques could potentially be used to identify which triggers have been trojaned into a model (perhaps via poisoning on the Internet), and then pinpoint exact sources (to block them from future training runs, or take other action).</p> <p>My solution is available here: <a href="https://github.com/iamgroot42/tdc_23">https://github.com/iamgroot42/tdc_23</a></p>]]></content><author><name>Anshuman Suri</name></author><category term="competition"/><category term="trojan detection"/><category term="robustness"/><category term="large language models"/><summary type="html"><![CDATA[Description of my entry to the TDC Trojan Detection challenge (co-located with NeurIPS 2023).]]></summary></entry><entry><title type="html">My submission to the MICO Challenge</title><link href="https://anshumansuri.com/blog/2023/mico/" rel="alternate" type="text/html" title="My submission to the MICO Challenge"/><published>2023-01-31T00:00:00+00:00</published><updated>2023-01-31T00:00:00+00:00</updated><id>https://anshumansuri.com/blog/2023/mico</id><content type="html" xml:base="https://anshumansuri.com/blog/2023/mico/"><![CDATA[<p>Here I describe my approach to the <a href="https://github.com/microsoft/MICO">MICO challenge</a>, co-located with <a href="https://satml.org/">SaTML</a> 2023. 
Specifically, I walk through my solution for the <a href="https://codalab.lisn.upsaclay.fr/competitions/8551">CIFAR track</a>, which got me the <a href="https://codalab.lisn.upsaclay.fr/competitions/8551#results">second position on the final leaderboard</a>. I used the same approach for the Purchase100 track as well, but finished fourth. A description of all the winning approaches can be found <a href="https://microsoft.github.io/MICO/">here</a>.</p> <p>Let’s start by downloading relevant data from the MICO competition, which can be found <a href="https://codalab.lisn.upsaclay.fr/competitions/8551#participate-submit_results">here</a> (for CIFAR).</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="n">os</span>
<span class="kn">import</span> <span class="n">urllib</span>

<span class="kn">from</span> <span class="n">torchvision.datasets.utils</span> <span class="kn">import</span> <span class="n">download_and_extract_archive</span>
<span class="kn">from</span> <span class="n">sklearn.metrics</span> <span class="kn">import</span> <span class="n">roc_curve</span><span class="p">,</span> <span class="n">roc_auc_score</span>

<span class="kn">from</span> <span class="n">mico_competition.scoring</span> <span class="kn">import</span> <span class="n">tpr_at_fpr</span><span class="p">,</span> <span class="n">score</span><span class="p">,</span> <span class="n">generate_roc</span><span class="p">,</span> <span class="n">generate_table</span>

<span class="kn">import</span> <span class="n">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="kn">import</span> <span class="n">torch</span>
<span class="kn">import</span> <span class="n">csv</span>
<span class="kn">import</span> <span class="n">copy</span>

<span class="kn">from</span> <span class="n">torch.autograd</span> <span class="kn">import</span> <span class="n">Variable</span>
<span class="kn">from</span> <span class="n">sklearn</span> <span class="kn">import</span> <span class="n">metrics</span>
<span class="kn">from</span> <span class="n">tqdm.notebook</span> <span class="kn">import</span> <span class="n">tqdm</span>
<span class="kn">from</span> <span class="n">torch.distributions</span> <span class="kn">import</span> <span class="n">normal</span>
<span class="kn">from</span> <span class="n">torch.utils.data</span> <span class="kn">import</span> <span class="n">DataLoader</span><span class="p">,</span> <span class="n">Dataset</span>
<span class="kn">from</span> <span class="n">mico_competition</span> <span class="kn">import</span> <span class="n">ChallengeDataset</span><span class="p">,</span> <span class="n">load_cifar10</span><span class="p">,</span> <span class="n">load_model</span>
<span class="kn">from</span> <span class="n">torch.distributions</span> <span class="kn">import</span> <span class="n">Categorical</span>
<span class="kn">import</span> <span class="n">torch.nn.utils.prune</span> <span class="k">as</span> <span class="n">prune</span>

<span class="kn">import</span> <span class="n">pandas</span> <span class="k">as</span> <span class="n">pd</span>
<span class="kn">import</span> <span class="n">matplotlib.pyplot</span> <span class="k">as</span> <span class="n">plt</span>
<span class="kn">import</span> <span class="n">matplotlib</span>
<span class="kn">import</span> <span class="n">torch</span> <span class="k">as</span> <span class="n">ch</span>
<span class="kn">import</span> <span class="n">torch.nn</span> <span class="k">as</span> <span class="n">nn</span>

<span class="kn">from</span> <span class="n">sklearn.model_selection</span> <span class="kn">import</span> <span class="n">train_test_split</span>
<span class="kn">from</span> <span class="n">sklearn</span> <span class="kn">import</span> <span class="n">preprocessing</span>
<span class="kn">from</span> <span class="n">sklearn.inspection</span> <span class="kn">import</span> <span class="n">permutation_importance</span>
<span class="kn">from</span> <span class="n">sklearn.preprocessing</span> <span class="kn">import</span> <span class="n">StandardScaler</span>
<span class="kn">from</span> <span class="n">sklearn.pipeline</span> <span class="kn">import</span> <span class="n">make_pipeline</span>
<span class="kn">from</span> <span class="n">sklearn</span> <span class="kn">import</span> <span class="n">tree</span>
<span class="kn">from</span> <span class="n">scipy.stats</span> <span class="kn">import</span> <span class="n">norm</span>

<span class="kn">import</span> <span class="n">autosklearn.classification</span>
<span class="kn">import</span> <span class="n">autosklearn.metrics</span>
</code></pre></div></div> <h2 id="features">Features</h2> <h3 id="target-model-features">Target-model features</h3> <p>Let’s start by collecting features that, for a given target model, only utilize information from that model and any additional data. Later in the post, we will cover features that make use of other ‘reference’ models and the problem setup.</p> <p>The first feature I included is based on the approach described in <d-cite key="carlini2022membership"></d-cite>, using class-scaled logits instead of direct probabilities.</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">get_class_scaled_logits</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="n">features</span><span class="p">,</span> <span class="n">labels</span><span class="p">):</span>
    <span class="n">outputs</span> <span class="o">=</span> <span class="nf">model</span><span class="p">(</span><span class="n">features</span><span class="p">).</span><span class="nf">detach</span><span class="p">().</span><span class="nf">cpu</span><span class="p">().</span><span class="nf">numpy</span><span class="p">()</span>
    <span class="n">num_classes</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">arange</span><span class="p">(</span><span class="n">outputs</span><span class="p">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span>
    <span class="n">values</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">output</span> <span class="ow">in</span> <span class="nf">enumerate</span><span class="p">(</span><span class="n">outputs</span><span class="p">):</span>
        <span class="n">label</span> <span class="o">=</span> <span class="n">labels</span><span class="p">[</span><span class="n">i</span><span class="p">].</span><span class="nf">item</span><span class="p">()</span>
        <span class="n">wanted</span> <span class="o">=</span> <span class="n">output</span><span class="p">[</span><span class="n">label</span><span class="p">]</span>
        <span class="n">not_wanted</span> <span class="o">=</span> <span class="n">output</span><span class="p">[</span><span class="n">np</span><span class="p">.</span><span class="nf">delete</span><span class="p">(</span><span class="n">num_classes</span><span class="p">,</span> <span class="n">label</span><span class="p">)]</span>
        <span class="n">values</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="n">wanted</span> <span class="o">-</span> <span class="n">np</span><span class="p">.</span><span class="nf">max</span><span class="p">(</span><span class="n">not_wanted</span><span class="p">))</span>
    <span class="k">return</span> <span class="n">np</span><span class="p">.</span><span class="nf">array</span><span class="p">(</span><span class="n">values</span><span class="p">)</span>
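
# A vectorized numpy equivalent of the loop above (sketch; assumes
# outputs has shape [batch, classes] and labels is an integer array):
def class_scaled_margin(outputs, labels):
    import numpy as np
    idx = np.arange(outputs.shape[0])
    true_logits = outputs[idx, labels]
    masked = outputs.copy()
    masked[idx, labels] = -np.inf
    # margin between true-class logit and best competing class
    return true_logits - masked.max(axis=1)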
</code></pre></div></div> <p>Next, I use the MERLIN<d-cite key="jayaraman2021revisiting"></d-cite> approach to sample neighbors and note variation in model loss. My modification uses a log scale when noting loss differences.</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nd">@torch.no_grad</span><span class="p">()</span>
<span class="k">def</span> <span class="nf">relative_log_merlin</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="n">features</span><span class="p">,</span> <span class="n">labels</span><span class="p">):</span>
    <span class="n">epsilon</span> <span class="o">=</span> <span class="mf">0.5</span>
    <span class="n">small_value</span> <span class="o">=</span> <span class="mf">1e-10</span>
    <span class="n">n_neighbors</span> <span class="o">=</span> <span class="mi">50</span>
    <span class="n">criterion</span> <span class="o">=</span> <span class="n">torch</span><span class="p">.</span><span class="n">nn</span><span class="p">.</span><span class="nc">CrossEntropyLoss</span><span class="p">(</span><span class="n">reduction</span><span class="o">=</span><span class="sh">'</span><span class="s">none</span><span class="sh">'</span><span class="p">)</span>
    <span class="n">noise</span> <span class="o">=</span> <span class="n">normal</span><span class="p">.</span><span class="nc">Normal</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">epsilon</span><span class="p">)</span>
    <span class="n">diffs</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="n">base_preds</span> <span class="o">=</span> <span class="nf">model</span><span class="p">(</span><span class="n">features</span><span class="p">)</span>
    <span class="n">base_losses</span> <span class="o">=</span> <span class="nf">criterion</span><span class="p">(</span><span class="n">base_preds</span><span class="p">,</span> <span class="n">labels</span><span class="p">).</span><span class="nf">cpu</span><span class="p">().</span><span class="nf">numpy</span><span class="p">()</span>
    <span class="n">base_preds</span> <span class="o">=</span> <span class="n">base_preds</span><span class="p">.</span><span class="nf">cpu</span><span class="p">().</span><span class="nf">numpy</span><span class="p">()</span>
    <span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">feature</span> <span class="ow">in</span> <span class="nf">enumerate</span><span class="p">(</span><span class="n">features</span><span class="p">):</span>
        <span class="n">neighbors</span> <span class="o">=</span> <span class="p">[]</span>
        <span class="n">distances</span> <span class="o">=</span> <span class="p">[]</span>
        <span class="k">for</span> <span class="n">_</span> <span class="ow">in</span> <span class="nf">range</span><span class="p">(</span><span class="n">n_neighbors</span><span class="p">):</span>
            <span class="n">sampled_noise</span> <span class="o">=</span> <span class="n">noise</span><span class="p">.</span><span class="nf">sample</span><span class="p">(</span><span class="n">feature</span><span class="p">.</span><span class="n">shape</span><span class="p">).</span><span class="nf">to</span><span class="p">(</span><span class="n">feature</span><span class="p">.</span><span class="n">device</span><span class="p">)</span>
            <span class="n">neighbors</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="n">feature</span> <span class="o">+</span> <span class="n">sampled_noise</span><span class="p">)</span>
            <span class="n">distances</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="n">sampled_noise</span><span class="p">.</span><span class="nf">mean</span><span class="p">().</span><span class="nf">cpu</span><span class="p">().</span><span class="nf">item</span><span class="p">())</span>
        <span class="n">neighbors</span> <span class="o">=</span> <span class="n">torch</span><span class="p">.</span><span class="nf">stack</span><span class="p">(</span><span class="n">neighbors</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span>
        <span class="n">loss_neighbors</span> <span class="o">=</span> <span class="nf">criterion</span><span class="p">(</span><span class="nf">model</span><span class="p">(</span><span class="n">neighbors</span><span class="p">),</span> <span class="n">labels</span><span class="p">[</span><span class="n">i</span><span class="p">].</span><span class="nf">view</span><span class="p">(</span><span class="mi">1</span><span class="p">).</span><span class="nf">repeat</span><span class="p">(</span><span class="n">n_neighbors</span><span class="p">))</span>
        <span class="n">loss_change</span> <span class="o">=</span> <span class="n">ch</span><span class="p">.</span><span class="nf">norm</span><span class="p">((</span><span class="n">loss_neighbors</span> <span class="o">-</span> <span class="n">base_losses</span><span class="p">[</span><span class="n">i</span><span class="p">])).</span><span class="nf">item</span><span class="p">()</span>
        <span class="c1"># Use relative drop instead of absolute
</span>        <span class="n">loss_change</span> <span class="o">/=</span> <span class="p">(</span><span class="n">small_value</span> <span class="o">+</span> <span class="n">base_losses</span><span class="p">[</span><span class="n">i</span><span class="p">].</span><span class="nf">item</span><span class="p">())</span>
        <span class="n">diffs</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="nf">log</span><span class="p">(</span><span class="n">loss_change</span> <span class="o">+</span> <span class="n">small_value</span><span class="p">))</span>
    <span class="n">diffs</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">array</span><span class="p">(</span><span class="n">diffs</span><span class="p">)</span>
    <span class="c1"># Clip at zero (lower side)
</span>    <span class="n">diffs</span><span class="p">[</span><span class="n">diffs</span> <span class="o">&lt;</span> <span class="mi">0</span><span class="p">]</span> <span class="o">=</span> <span class="mi">0</span>
    <span class="k">return</span> <span class="n">diffs</span>
</code></pre></div></div> <p>Next, I perform gradient descent on the given data (using the training loss) to modify the input. I note the change in model loss after the update, as well as the change in the input itself. My intuition here was that members would have less scope for loss reduction (and consequently smaller changes to the datum itself).</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">ascent_recovery</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="n">features</span><span class="p">,</span> <span class="n">labels</span><span class="p">,</span> <span class="n">adv</span><span class="p">:</span> <span class="nb">bool</span> <span class="o">=</span> <span class="bp">False</span><span class="p">):</span>
    <span class="n">criterion</span> <span class="o">=</span> <span class="n">torch</span><span class="p">.</span><span class="n">nn</span><span class="p">.</span><span class="nc">CrossEntropyLoss</span><span class="p">(</span><span class="n">reduction</span><span class="o">=</span><span class="sh">'</span><span class="s">none</span><span class="sh">'</span><span class="p">)</span>
    <span class="n">n_times</span> <span class="o">=</span> <span class="mi">10</span>
    <span class="n">step_size</span> <span class="o">=</span> <span class="mf">0.01</span> <span class="k">if</span> <span class="n">adv</span> <span class="k">else</span> <span class="mf">0.1</span> <span class="c1"># For normal, use higher
</span>    <span class="n">final_losses</span><span class="p">,</span> <span class="n">final_dist</span> <span class="o">=</span> <span class="p">[],</span> <span class="p">[]</span>
    <span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="p">(</span><span class="n">feature</span><span class="p">,</span> <span class="n">label</span><span class="p">)</span> <span class="ow">in</span> <span class="nf">enumerate</span><span class="p">(</span><span class="nf">zip</span><span class="p">(</span><span class="n">features</span><span class="p">,</span> <span class="n">labels</span><span class="p">)):</span>
        <span class="n">model</span><span class="p">.</span><span class="nf">zero_grad</span><span class="p">()</span>
        <span class="n">feature_var</span> <span class="o">=</span> <span class="nc">Variable</span><span class="p">(</span><span class="n">feature</span><span class="p">.</span><span class="nf">clone</span><span class="p">().</span><span class="nf">detach</span><span class="p">(),</span> <span class="n">requires_grad</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
        <span class="k">for</span> <span class="n">j</span> <span class="ow">in</span> <span class="nf">range</span><span class="p">(</span><span class="n">n_times</span><span class="p">):</span>
            <span class="n">feature_var</span> <span class="o">=</span> <span class="nc">Variable</span><span class="p">(</span><span class="n">feature_var</span><span class="p">.</span><span class="nf">clone</span><span class="p">().</span><span class="nf">detach</span><span class="p">(),</span> <span class="n">requires_grad</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
            <span class="n">loss</span> <span class="o">=</span> <span class="nf">criterion</span><span class="p">(</span><span class="nf">model</span><span class="p">(</span><span class="n">ch</span><span class="p">.</span><span class="nf">unsqueeze</span><span class="p">(</span><span class="n">feature_var</span><span class="p">,</span> <span class="mi">0</span><span class="p">)),</span> <span class="n">torch</span><span class="p">.</span><span class="nf">unsqueeze</span><span class="p">(</span><span class="n">label</span><span class="p">,</span> <span class="mi">0</span><span class="p">))</span>
            <span class="n">loss</span><span class="p">.</span><span class="nf">backward</span><span class="p">(</span><span class="n">ch</span><span class="p">.</span><span class="nf">ones_like</span><span class="p">(</span><span class="n">loss</span><span class="p">),</span> <span class="n">retain_graph</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
            <span class="k">with</span> <span class="n">ch</span><span class="p">.</span><span class="nf">no_grad</span><span class="p">():</span>
                <span class="k">if</span> <span class="n">adv</span><span class="p">:</span>
                    <span class="n">feature_var</span><span class="p">.</span><span class="n">data</span> <span class="o">+=</span> <span class="n">step_size</span> <span class="o">*</span> <span class="n">feature_var</span><span class="p">.</span><span class="n">grad</span><span class="p">.</span><span class="n">data</span>
                <span class="k">else</span><span class="p">:</span>
                    <span class="n">feature_var</span><span class="p">.</span><span class="n">data</span> <span class="o">-=</span> <span class="n">step_size</span> <span class="o">*</span> <span class="n">feature_var</span><span class="p">.</span><span class="n">grad</span><span class="p">.</span><span class="n">data</span>
                <span class="n">loss_new</span> <span class="o">=</span> <span class="nf">criterion</span><span class="p">(</span><span class="nf">model</span><span class="p">(</span><span class="n">ch</span><span class="p">.</span><span class="nf">unsqueeze</span><span class="p">(</span><span class="n">feature_var</span><span class="p">,</span> <span class="mi">0</span><span class="p">)),</span> <span class="n">ch</span><span class="p">.</span><span class="nf">unsqueeze</span><span class="p">(</span><span class="n">label</span><span class="p">,</span> <span class="mi">0</span><span class="p">))</span>
        <span class="c1"># Get reduction in loss
</span>        <span class="n">final_losses</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="n">loss</span><span class="p">.</span><span class="nf">item</span><span class="p">()</span> <span class="o">-</span> <span class="n">loss_new</span><span class="p">.</span><span class="nf">item</span><span class="p">())</span>
        <span class="c1"># Get change in data (norm)
</span>        <span class="n">final_dist</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="n">ch</span><span class="p">.</span><span class="nf">norm</span><span class="p">(</span><span class="n">feature_var</span><span class="p">.</span><span class="n">data</span> <span class="o">-</span> <span class="n">feature</span><span class="p">.</span><span class="n">data</span><span class="p">).</span><span class="nf">detach</span><span class="p">().</span><span class="nf">cpu</span><span class="p">().</span><span class="nf">numpy</span><span class="p">())</span>
    <span class="n">final_losses</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">stack</span><span class="p">((</span><span class="n">final_losses</span><span class="p">,</span> <span class="n">final_dist</span><span class="p">),</span> <span class="mi">1</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">final_losses</span><span class="p">.</span><span class="nf">reshape</span><span class="p">(</span><span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">)</span>
</code></pre></div></div> <p>The next idea is to “train” the model on a given datapoint and see how much the loss changes. If the model was trained with DP (and thus with clipped gradients), a given point has effectively not been seen many times, so the expected loss decrease should be much larger than for a point that has already been seen multiple times. While at it, I also take note of gradient norms.</p> <div class="alert alert-info" role="alert"> I tried using learning rates corresponding to the different training mechanisms; it turns out that using the same, higher LR works best in practice. </div> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">extended_epoch</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="n">features</span><span class="p">,</span> <span class="n">labels</span><span class="p">,</span> <span class="n">use_dp</span><span class="p">:</span> <span class="nb">bool</span> <span class="o">=</span> <span class="bp">False</span><span class="p">):</span>
    <span class="n">lr</span> <span class="o">=</span> <span class="mf">0.05</span> <span class="c1"># if use_dp else 0.0005
</span>    <span class="n">criterion</span> <span class="o">=</span> <span class="n">nn</span><span class="p">.</span><span class="nc">CrossEntropyLoss</span><span class="p">(</span><span class="n">reduction</span><span class="o">=</span><span class="sh">'</span><span class="s">none</span><span class="sh">'</span><span class="p">)</span>
    <span class="n">features_collected</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="c1"># Note losses currently
</span>    <span class="n">base_preds</span> <span class="o">=</span> <span class="nf">model</span><span class="p">(</span><span class="n">features</span><span class="p">).</span><span class="nf">detach</span><span class="p">()</span>
    <span class="n">base_losses</span> <span class="o">=</span> <span class="nf">criterion</span><span class="p">(</span><span class="n">base_preds</span><span class="p">,</span> <span class="n">labels</span><span class="p">).</span><span class="nf">cpu</span><span class="p">().</span><span class="nf">numpy</span><span class="p">()</span>
    <span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="p">(</span><span class="n">feature</span><span class="p">,</span> <span class="n">label</span><span class="p">)</span> <span class="ow">in</span> <span class="nf">enumerate</span><span class="p">(</span><span class="nf">zip</span><span class="p">(</span><span class="n">features</span><span class="p">,</span> <span class="n">labels</span><span class="p">)):</span>
        <span class="c1"># Make copy of model
</span>        <span class="n">model_</span> <span class="o">=</span> <span class="n">copy</span><span class="p">.</span><span class="nf">deepcopy</span><span class="p">(</span><span class="n">model</span><span class="p">)</span>
        <span class="n">model_</span><span class="p">.</span><span class="nf">train</span><span class="p">()</span>
        <span class="n">model_</span><span class="p">.</span><span class="nf">cuda</span><span class="p">()</span>
        <span class="n">optimizer</span> <span class="o">=</span> <span class="n">ch</span><span class="p">.</span><span class="n">optim</span><span class="p">.</span><span class="nc">SGD</span><span class="p">(</span><span class="n">model_</span><span class="p">.</span><span class="nf">parameters</span><span class="p">(),</span> <span class="n">lr</span><span class="o">=</span><span class="n">lr</span><span class="p">,</span> <span class="n">momentum</span><span class="o">=</span><span class="mi">0</span><span class="p">)</span>
        <span class="n">optimizer</span><span class="p">.</span><span class="nf">zero_grad</span><span class="p">()</span>
        <span class="n">loss</span> <span class="o">=</span> <span class="nf">criterion</span><span class="p">(</span><span class="nf">model_</span><span class="p">(</span><span class="n">ch</span><span class="p">.</span><span class="nf">unsqueeze</span><span class="p">(</span><span class="n">feature</span><span class="p">,</span> <span class="mi">0</span><span class="p">)),</span> <span class="n">ch</span><span class="p">.</span><span class="nf">unsqueeze</span><span class="p">(</span><span class="n">label</span><span class="p">,</span> <span class="mi">0</span><span class="p">))</span>
        <span class="n">loss</span><span class="p">.</span><span class="nf">backward</span><span class="p">()</span>
        <span class="n">optimizer</span><span class="p">.</span><span class="nf">step</span><span class="p">()</span>
        <span class="c1"># Keep track of gradient norms
</span>        <span class="n">gradient_norms</span> <span class="o">=</span> <span class="p">[</span><span class="n">ch</span><span class="p">.</span><span class="n">linalg</span><span class="p">.</span><span class="nf">norm</span><span class="p">(</span><span class="n">x</span><span class="p">.</span><span class="n">grad</span><span class="p">.</span><span class="nf">detach</span><span class="p">().</span><span class="nf">cpu</span><span class="p">()).</span><span class="nf">item</span><span class="p">()</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">model_</span><span class="p">.</span><span class="nf">parameters</span><span class="p">()]</span>
        <span class="c1"># Keep track of updated loss
</span>        <span class="n">loss_new</span> <span class="o">=</span> <span class="nf">criterion</span><span class="p">(</span><span class="nf">model_</span><span class="p">(</span><span class="n">ch</span><span class="p">.</span><span class="nf">unsqueeze</span><span class="p">(</span><span class="n">feature</span><span class="p">,</span> <span class="mi">0</span><span class="p">)),</span> <span class="n">ch</span><span class="p">.</span><span class="nf">unsqueeze</span><span class="p">(</span><span class="n">label</span><span class="p">,</span> <span class="mi">0</span><span class="p">)).</span><span class="nf">detach</span><span class="p">().</span><span class="nf">cpu</span><span class="p">().</span><span class="nf">numpy</span><span class="p">()</span>
        <span class="n">loss_difference</span> <span class="o">=</span> <span class="p">(</span><span class="n">base_losses</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">-</span> <span class="n">loss_new</span><span class="p">).</span><span class="nf">item</span><span class="p">()</span>
        <span class="n">gradient_norms</span> <span class="o">+=</span> <span class="p">[</span><span class="n">loss_difference</span><span class="p">]</span>
        <span class="n">features_collected</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="n">gradient_norms</span><span class="p">)</span>
    <span class="n">features_collected</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">array</span><span class="p">(</span><span class="n">features_collected</span><span class="p">)</span>
    <span class="n">features_collected</span> <span class="o">=</span> <span class="n">features_collected</span><span class="p">.</span><span class="nf">reshape</span><span class="p">(</span><span class="n">features_collected</span><span class="p">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="o">-</span><span class="mi">1</span><span class="p">)</span>
    <span class="c1"># Do not care about biases or loss diff
</span>    <span class="n">features_collected</span> <span class="o">=</span> <span class="n">features_collected</span><span class="p">[:,</span> <span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">4</span><span class="p">]]</span>
    <span class="c1"># features_collected = np.log(features_collected + 1e-10)
</span>    <span class="k">return</span> <span class="n">features_collected</span><span class="p">.</span><span class="nf">reshape</span><span class="p">(</span><span class="n">features_collected</span><span class="p">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="o">-</span><span class="mi">1</span><span class="p">)</span>
</code></pre></div></div> <p>The next feature is inspired by dataset inference<d-cite key="mainidataset"></d-cite>. It takes fixed-size steps along a random direction (the same direction throughout a walk) and keeps track of how many steps it takes to flip the classification. The main idea is that members and non-members have different proximities to the decision boundary, and would thus yield different statistics for this measurement.</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">blind_walk</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="n">features</span><span class="p">,</span> <span class="n">labels</span><span class="p">):</span>
    <span class="c1"># Track the number of steps taken until decision flips 
</span>    <span class="c1"># Walk no more than 100 steps, and try 10 different random directions
</span>    <span class="n">num_directions</span> <span class="o">=</span> <span class="mi">10</span>
    <span class="n">num_max_steps</span> <span class="o">=</span> <span class="mi">100</span>
    <span class="n">point_of_failure</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">ones</span><span class="p">((</span><span class="n">num_directions</span><span class="p">,</span> <span class="n">features</span><span class="p">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">]))</span> <span class="o">*</span> <span class="n">np</span><span class="p">.</span><span class="n">inf</span>
    <span class="n">std</span> <span class="o">=</span> <span class="mf">0.1</span>
    <span class="k">for</span> <span class="n">j</span> <span class="ow">in</span> <span class="nf">range</span><span class="p">(</span><span class="n">num_directions</span><span class="p">):</span>
        <span class="n">noise</span> <span class="o">=</span> <span class="n">ch</span><span class="p">.</span><span class="nf">randn_like</span><span class="p">(</span><span class="n">features</span><span class="p">)</span> <span class="o">*</span> <span class="n">std</span>
        <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nf">range</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="n">num_max_steps</span> <span class="o">+</span> <span class="mi">1</span><span class="p">):</span>
            <span class="n">new_labels</span> <span class="o">=</span> <span class="n">ch</span><span class="p">.</span><span class="nf">argmax</span><span class="p">(</span><span class="nf">model</span><span class="p">(</span><span class="n">features</span> <span class="o">+</span> <span class="n">noise</span> <span class="o">*</span> <span class="n">i</span><span class="p">).</span><span class="nf">detach</span><span class="p">(),</span> <span class="mi">1</span><span class="p">)</span>
            <span class="n">flipped</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">nonzero</span><span class="p">((</span><span class="n">new_labels</span> <span class="o">!=</span> <span class="n">labels</span><span class="p">).</span><span class="nf">cpu</span><span class="p">().</span><span class="nf">numpy</span><span class="p">())[</span><span class="mi">0</span><span class="p">]</span>
            <span class="n">point_of_failure</span><span class="p">[</span><span class="n">j</span><span class="p">][</span><span class="n">flipped</span><span class="p">]</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">minimum</span><span class="p">(</span><span class="n">point_of_failure</span><span class="p">[</span><span class="n">j</span><span class="p">][</span><span class="n">flipped</span><span class="p">],</span> <span class="n">i</span><span class="p">)</span>
    <span class="n">point_of_failure</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">clip</span><span class="p">(</span><span class="n">point_of_failure</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="n">num_max_steps</span><span class="p">)</span>
    <span class="n">point_of_failure</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">mean</span><span class="p">(</span><span class="n">point_of_failure</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">point_of_failure</span><span class="p">.</span><span class="nf">reshape</span><span class="p">(</span><span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span>
</code></pre></div></div> <p>Now, we collect all of the features described above. Note that I also include an ‘adv’ variant of gradient ascent that looks to maximize the loss instead of minimizing it.</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">custom_feature_collection</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="n">features</span><span class="p">,</span> <span class="n">labels</span><span class="p">,</span> <span class="n">use_dp</span><span class="p">:</span> <span class="nb">bool</span> <span class="o">=</span> <span class="bp">False</span><span class="p">):</span>
    <span class="n">features_collected</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="n">features_collected</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="nf">ascent_recovery</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="n">features</span><span class="p">,</span> <span class="n">labels</span><span class="p">))</span>
    <span class="n">features_collected</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="nf">ascent_recovery</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="n">features</span><span class="p">,</span> <span class="n">labels</span><span class="p">,</span> <span class="n">adv</span> <span class="o">=</span> <span class="bp">True</span><span class="p">))</span>
    <span class="n">features_collected</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="nf">extended_epoch</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="n">features</span><span class="p">,</span> <span class="n">labels</span><span class="p">,</span> <span class="n">use_dp</span> <span class="o">=</span> <span class="n">use_dp</span><span class="p">))</span>
    <span class="n">features_collected</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="nf">relative_log_merlin</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="n">features</span><span class="p">,</span> <span class="n">labels</span><span class="p">).</span><span class="nf">reshape</span><span class="p">(</span><span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">))</span>
    <span class="n">features_collected</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="nf">get_class_scaled_logits</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="n">features</span><span class="p">,</span> <span class="n">labels</span><span class="p">).</span><span class="nf">reshape</span><span class="p">(</span><span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">))</span>
    <span class="n">features_collected</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="nf">blind_walk</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="n">features</span><span class="p">,</span> <span class="n">labels</span><span class="p">).</span><span class="nf">reshape</span><span class="p">(</span><span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">))</span>
    <span class="n">combined_feratures</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">concatenate</span><span class="p">(</span><span class="n">features_collected</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">combined_feratures</span>
</code></pre></div></div> <h3 id="reference-model-features">Reference-model features</h3> <p>The problem structure and distribution of data (see <a href="https://codalab.lisn.upsaclay.fr/competitions/8551#participate">here</a>) suggests that when looking at a given datapoint, it is more likely to have been a member of another randomly selected model, than not being a member. My intention was to somehow utilize this information in the attack.</p> <p>To compare these trends with reference models, I took inspiration from the MATT<d-cite key="sablayrolles2019white"></d-cite> attack, which works by computing gradient alignment between the given model and another reference model. Although the original attack assumes linear models (and cannot work for convolutional layers/deep models), I figured gradient similarity would still be a useful metric to compare models. For any given model, I sample 25 random reference models (out of all 100 train models) and compute gradient similarity between the given model and each reference model.</p> <p>Below, we compute gradient similarity between the given model and some reference model.</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">matt_modified_scores</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="n">features</span><span class="p">,</span> <span class="n">labels</span><span class="p">,</span> <span class="n">model_reference</span><span class="p">):</span>
    <span class="n">criterion</span> <span class="o">=</span> <span class="n">nn</span><span class="p">.</span><span class="nc">CrossEntropyLoss</span><span class="p">(</span><span class="n">reduction</span><span class="o">=</span><span class="sh">'</span><span class="s">none</span><span class="sh">'</span><span class="p">)</span>
    <span class="n">cos</span> <span class="o">=</span> <span class="n">nn</span><span class="p">.</span><span class="nc">CosineSimilarity</span><span class="p">(</span><span class="n">dim</span><span class="o">=</span><span class="mi">0</span><span class="p">)</span>
    <span class="n">features_collected</span> <span class="o">=</span> <span class="p">[]</span>

    <span class="c1"># Make copy of model
</span>    <span class="n">model_</span> <span class="o">=</span> <span class="n">copy</span><span class="p">.</span><span class="nf">deepcopy</span><span class="p">(</span><span class="n">model</span><span class="p">)</span>
    <span class="n">model_</span><span class="p">.</span><span class="nf">cuda</span><span class="p">()</span>

    <span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="p">(</span><span class="n">feature</span><span class="p">,</span> <span class="n">label</span><span class="p">)</span> <span class="ow">in</span> <span class="nf">enumerate</span><span class="p">(</span><span class="nf">zip</span><span class="p">(</span><span class="n">features</span><span class="p">,</span> <span class="n">labels</span><span class="p">)):</span>
        <span class="c1"># Compute gradients with both models
</span>        <span class="n">model_</span><span class="p">.</span><span class="nf">zero_grad</span><span class="p">()</span>
        <span class="n">model_reference</span><span class="p">.</span><span class="nf">zero_grad</span><span class="p">()</span>
        <span class="n">loss</span> <span class="o">=</span> <span class="nf">criterion</span><span class="p">(</span><span class="nf">model_</span><span class="p">(</span><span class="n">ch</span><span class="p">.</span><span class="nf">unsqueeze</span><span class="p">(</span><span class="n">feature</span><span class="p">,</span> <span class="mi">0</span><span class="p">)),</span> <span class="n">ch</span><span class="p">.</span><span class="nf">unsqueeze</span><span class="p">(</span><span class="n">label</span><span class="p">,</span> <span class="mi">0</span><span class="p">))</span>
        <span class="n">loss_ref</span> <span class="o">=</span> <span class="nf">criterion</span><span class="p">(</span><span class="nf">model_reference</span><span class="p">(</span><span class="n">ch</span><span class="p">.</span><span class="nf">unsqueeze</span><span class="p">(</span><span class="n">feature</span><span class="p">,</span> <span class="mi">0</span><span class="p">)),</span> <span class="n">ch</span><span class="p">.</span><span class="nf">unsqueeze</span><span class="p">(</span><span class="n">label</span><span class="p">,</span> <span class="mi">0</span><span class="p">))</span>
        <span class="n">loss</span><span class="p">.</span><span class="nf">backward</span><span class="p">()</span>
        <span class="n">loss_ref</span><span class="p">.</span><span class="nf">backward</span><span class="p">()</span>
        
        <span class="c1"># Compute product
</span>        <span class="n">inner_features</span> <span class="o">=</span> <span class="p">[]</span>
        <span class="k">for</span> <span class="n">p1</span><span class="p">,</span> <span class="n">p2</span> <span class="ow">in</span> <span class="nf">zip</span><span class="p">(</span><span class="n">model_</span><span class="p">.</span><span class="nf">parameters</span><span class="p">(),</span> <span class="n">model_reference</span><span class="p">.</span><span class="nf">parameters</span><span class="p">()):</span>
            <span class="n">term</span> <span class="o">=</span> <span class="n">ch</span><span class="p">.</span><span class="nf">dot</span><span class="p">(</span><span class="n">p1</span><span class="p">.</span><span class="n">grad</span><span class="p">.</span><span class="nf">detach</span><span class="p">().</span><span class="nf">flatten</span><span class="p">(),</span> <span class="n">p2</span><span class="p">.</span><span class="n">grad</span><span class="p">.</span><span class="nf">detach</span><span class="p">().</span><span class="nf">flatten</span><span class="p">()).</span><span class="nf">item</span><span class="p">()</span>
            <span class="n">inner_features</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="n">term</span><span class="p">)</span>
        <span class="n">features_collected</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="n">inner_features</span><span class="p">)</span>
    <span class="n">features_collected</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">array</span><span class="p">(</span><span class="n">features_collected</span><span class="p">)</span>
    <span class="c1"># Focus only on weight-related parameters
</span>    <span class="n">features_collected</span> <span class="o">=</span> <span class="n">features_collected</span><span class="p">[:,</span> <span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">6</span><span class="p">]]</span>
    <span class="k">return</span> <span class="n">features_collected</span>
</code></pre></div></div> <p>To aggregate gradient-alignment information across reference models (the reference models are randomly selected, so individual scores are not directly comparable across models), I compute the range, sum of absolute values, min, and max of the gradient-alignment scores.</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">extract_features_for_reference_models</span><span class="p">(</span><span class="n">reference_features</span><span class="p">):</span>
    <span class="n">num_layers_collected</span> <span class="o">=</span> <span class="n">reference_features</span><span class="p">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span>
    <span class="n">features</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nf">range</span><span class="p">(</span><span class="n">num_layers_collected</span><span class="p">):</span>
        <span class="n">features</span><span class="p">.</span><span class="nf">append</span><span class="p">((</span>
            <span class="n">np</span><span class="p">.</span><span class="nf">max</span><span class="p">(</span><span class="n">reference_features</span><span class="p">[:,</span> <span class="p">:,</span> <span class="n">i</span><span class="p">],</span> <span class="mi">0</span><span class="p">)</span> <span class="o">-</span> <span class="n">np</span><span class="p">.</span><span class="nf">min</span><span class="p">(</span><span class="n">reference_features</span><span class="p">[:,</span> <span class="p">:,</span>  <span class="n">i</span><span class="p">],</span> <span class="mi">0</span><span class="p">),</span>
            <span class="n">np</span><span class="p">.</span><span class="nf">sum</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="nf">abs</span><span class="p">(</span><span class="n">reference_features</span><span class="p">[:,</span> <span class="p">:,</span>  <span class="n">i</span><span class="p">]),</span> <span class="mi">0</span><span class="p">),</span>
            <span class="n">np</span><span class="p">.</span><span class="nf">min</span><span class="p">(</span><span class="n">reference_features</span><span class="p">[:,</span> <span class="p">:,</span>  <span class="n">i</span><span class="p">],</span> <span class="mi">0</span><span class="p">),</span>
            <span class="n">np</span><span class="p">.</span><span class="nf">max</span><span class="p">(</span><span class="n">reference_features</span><span class="p">[:,</span> <span class="p">:,</span>  <span class="n">i</span><span class="p">],</span> <span class="mi">0</span><span class="p">)</span>
        <span class="p">))</span>
    <span class="k">return</span> <span class="n">np</span><span class="p">.</span><span class="nf">concatenate</span><span class="p">(</span><span class="n">features</span><span class="p">,</span> <span class="mi">0</span><span class="p">).</span><span class="n">T</span>
</code></pre></div></div> <p>Let’s start by collecting all models from the ‘train’ split.</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">collect_models</span><span class="p">():</span>
    <span class="sh">"""</span><span class="s">
        Collect all models from the </span><span class="sh">'</span><span class="s">train</span><span class="sh">'</span><span class="s"> set
    </span><span class="sh">"""</span>
    <span class="n">CHALLENGE</span> <span class="o">=</span> <span class="sh">"</span><span class="s">cifar10</span><span class="sh">"</span>
    <span class="n">LEN_TRAINING</span> <span class="o">=</span> <span class="mi">50000</span>
    <span class="n">LEN_CHALLENGE</span> <span class="o">=</span> <span class="mi">100</span>
    <span class="n">scenarios</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="nf">listdir</span><span class="p">(</span><span class="n">CHALLENGE</span><span class="p">)</span>

    <span class="n">dataset</span> <span class="o">=</span> <span class="nf">load_cifar10</span><span class="p">(</span><span class="n">dataset_dir</span><span class="o">=</span><span class="sh">"</span><span class="s">/u/as9rw/work/MICO/data</span><span class="sh">"</span><span class="p">)</span>

    <span class="n">collected_models</span> <span class="o">=</span> <span class="p">{</span><span class="n">x</span><span class="p">:[]</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">scenarios</span><span class="p">}</span>
    <span class="n">phase</span> <span class="o">=</span> <span class="sh">"</span><span class="s">train</span><span class="sh">"</span>
    <span class="k">for</span> <span class="n">scenario</span> <span class="ow">in</span> <span class="nf">tqdm</span><span class="p">(</span><span class="n">scenarios</span><span class="p">,</span> <span class="n">desc</span><span class="o">=</span><span class="sh">"</span><span class="s">scenario</span><span class="sh">"</span><span class="p">):</span>
        <span class="n">root</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="n">CHALLENGE</span><span class="p">,</span> <span class="n">scenario</span><span class="p">,</span> <span class="n">phase</span><span class="p">)</span>
        <span class="k">for</span> <span class="n">model_folder</span> <span class="ow">in</span> <span class="nf">tqdm</span><span class="p">(</span><span class="nf">sorted</span><span class="p">(</span><span class="n">os</span><span class="p">.</span><span class="nf">listdir</span><span class="p">(</span><span class="n">root</span><span class="p">),</span> <span class="n">key</span><span class="o">=</span><span class="k">lambda</span> <span class="n">d</span><span class="p">:</span> <span class="nf">int</span><span class="p">(</span><span class="n">d</span><span class="p">.</span><span class="nf">split</span><span class="p">(</span><span class="sh">'</span><span class="s">_</span><span class="sh">'</span><span class="p">)[</span><span class="mi">1</span><span class="p">])),</span> <span class="n">desc</span><span class="o">=</span><span class="sh">"</span><span class="s">model</span><span class="sh">"</span><span class="p">):</span>
            <span class="n">path</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="n">root</span><span class="p">,</span> <span class="n">model_folder</span><span class="p">)</span>
            <span class="n">challenge_dataset</span> <span class="o">=</span> <span class="n">ChallengeDataset</span><span class="p">.</span><span class="nf">from_path</span><span class="p">(</span><span class="n">path</span><span class="p">,</span> <span class="n">dataset</span><span class="o">=</span><span class="n">dataset</span><span class="p">,</span> <span class="n">len_training</span><span class="o">=</span><span class="n">LEN_TRAINING</span><span class="p">)</span>
            <span class="n">challenge_points</span> <span class="o">=</span> <span class="n">challenge_dataset</span><span class="p">.</span><span class="nf">get_challenges</span><span class="p">()</span>
            
            <span class="n">model</span> <span class="o">=</span> <span class="nf">load_model</span><span class="p">(</span><span class="sh">'</span><span class="s">cifar10</span><span class="sh">'</span><span class="p">,</span> <span class="n">path</span><span class="p">)</span>
            <span class="n">collected_models</span><span class="p">[</span><span class="n">scenario</span><span class="p">].</span><span class="nf">append</span><span class="p">(</span><span class="n">model</span><span class="p">)</span>

        <span class="n">collected_models</span><span class="p">[</span><span class="n">scenario</span><span class="p">]</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">array</span><span class="p">(</span><span class="n">collected_models</span><span class="p">[</span><span class="n">scenario</span><span class="p">],</span> <span class="n">dtype</span><span class="o">=</span><span class="nb">object</span><span class="p">)</span>
            
    <span class="k">return</span> <span class="n">collected_models</span>
</code></pre></div></div> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">train_models</span> <span class="o">=</span> <span class="nf">collect_models</span><span class="p">()</span>
</code></pre></div></div> <p>Feature vectors are thus a combination of two kinds of features. The first set (originating from <code class="language-plaintext highlighter-rouge">matt_modified_scores</code>) uses reference models, computing statistics that compare a given model against these reference models. The second set of features, originating from <code class="language-plaintext highlighter-rouge">custom_feature_collection</code>, does not use any additional reference models.</p> <div class="alert alert-info" role="alert"> <b>For Purchase100:</b> without the reference-model-based features, the attack had a maximum TPR@0.1FPR of $\approx0.13$, which immediately jumped to $0.15$ with the inclusion of those reference-model-based features. Additional experimentation with the meta-classifier itself bumped performance further up to $&gt;0.16$. My experience with these MI attacks has been that <b><u>the choice of meta-classifier also matters</u></b>. Most of the MI-related papers I looked at use logistic-regression (LR) based models when dealing with features. It thus might be worthwhile to spend some time on feature engineering (using the same set of raw features) and meta-classifier optimization. </div> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">CHALLENGE</span> <span class="o">=</span> <span class="sh">"</span><span class="s">cifar10</span><span class="sh">"</span>
<span class="n">scenarios</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="nf">listdir</span><span class="p">(</span><span class="n">CHALLENGE</span><span class="p">)</span>
<span class="n">phase</span> <span class="o">=</span> <span class="sh">"</span><span class="s">train</span><span class="sh">"</span>
<span class="n">dataset</span> <span class="o">=</span> <span class="nf">load_cifar10</span><span class="p">(</span><span class="n">dataset_dir</span><span class="o">=</span><span class="sh">"</span><span class="s">/u/as9rw/work/MICO/data</span><span class="sh">"</span><span class="p">)</span>
<span class="n">LEN_TRAINING</span> <span class="o">=</span> <span class="mi">50000</span>
<span class="n">LEN_CHALLENGE</span> <span class="o">=</span> <span class="mi">100</span>

<span class="n">X_for_meta</span><span class="p">,</span> <span class="n">Y_for_meta</span> <span class="o">=</span> <span class="p">{},</span> <span class="p">{}</span>
<span class="n">num_use_others</span> <span class="o">=</span> <span class="mi">25</span> <span class="c1"># 50 worked best, but too slow
</span>
<span class="c1"># Check performance of approach on (1, n-1) models from train
</span><span class="k">for</span> <span class="n">scenario</span> <span class="ow">in</span> <span class="nf">tqdm</span><span class="p">(</span><span class="n">scenarios</span><span class="p">,</span> <span class="n">desc</span><span class="o">=</span><span class="sh">"</span><span class="s">scenario</span><span class="sh">"</span><span class="p">):</span>
    <span class="n">use_dp</span> <span class="o">=</span> <span class="ow">not</span> <span class="n">scenario</span><span class="p">.</span><span class="nf">endswith</span><span class="p">(</span><span class="sh">'</span><span class="s">_inf</span><span class="sh">'</span><span class="p">)</span>
    <span class="n">preds_all</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="n">scores_all</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="n">root</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="n">CHALLENGE</span><span class="p">,</span> <span class="n">scenario</span><span class="p">,</span> <span class="n">phase</span><span class="p">)</span>
    <span class="n">all_except</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">arange</span><span class="p">(</span><span class="mi">100</span><span class="p">)</span>

    <span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">model_folder</span> <span class="ow">in</span> <span class="nf">tqdm</span><span class="p">(</span><span class="nf">enumerate</span><span class="p">(</span><span class="nf">sorted</span><span class="p">(</span><span class="n">os</span><span class="p">.</span><span class="nf">listdir</span><span class="p">(</span><span class="n">root</span><span class="p">),</span> <span class="n">key</span><span class="o">=</span><span class="k">lambda</span> <span class="n">d</span><span class="p">:</span> <span class="nf">int</span><span class="p">(</span><span class="n">d</span><span class="p">.</span><span class="nf">split</span><span class="p">(</span><span class="sh">'</span><span class="s">_</span><span class="sh">'</span><span class="p">)[</span><span class="mi">1</span><span class="p">]))),</span> <span class="n">desc</span><span class="o">=</span><span class="sh">"</span><span class="s">model</span><span class="sh">"</span><span class="p">,</span> <span class="n">total</span><span class="o">=</span><span class="mi">100</span><span class="p">):</span>
        <span class="n">path</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="n">root</span><span class="p">,</span> <span class="n">model_folder</span><span class="p">)</span>
        <span class="n">challenge_dataset</span> <span class="o">=</span> <span class="n">ChallengeDataset</span><span class="p">.</span><span class="nf">from_path</span><span class="p">(</span><span class="n">path</span><span class="p">,</span> <span class="n">dataset</span><span class="o">=</span><span class="n">dataset</span><span class="p">,</span> <span class="n">len_training</span><span class="o">=</span><span class="n">LEN_TRAINING</span><span class="p">)</span>
        <span class="n">challenge_points</span> <span class="o">=</span> <span class="n">challenge_dataset</span><span class="p">.</span><span class="nf">get_challenges</span><span class="p">()</span>
        
        <span class="n">challenge_dataloader</span> <span class="o">=</span> <span class="n">torch</span><span class="p">.</span><span class="n">utils</span><span class="p">.</span><span class="n">data</span><span class="p">.</span><span class="nc">DataLoader</span><span class="p">(</span><span class="n">challenge_points</span><span class="p">,</span> <span class="n">batch_size</span><span class="o">=</span><span class="mi">2</span><span class="o">*</span><span class="n">LEN_CHALLENGE</span><span class="p">)</span>
        <span class="n">features</span><span class="p">,</span> <span class="n">labels</span> <span class="o">=</span> <span class="nf">next</span><span class="p">(</span><span class="nf">iter</span><span class="p">(</span><span class="n">challenge_dataloader</span><span class="p">))</span>
        <span class="n">features</span><span class="p">,</span> <span class="n">labels</span> <span class="o">=</span> <span class="n">features</span><span class="p">.</span><span class="nf">cuda</span><span class="p">(),</span> <span class="n">labels</span><span class="p">.</span><span class="nf">cuda</span><span class="p">()</span>
            
        <span class="n">model</span> <span class="o">=</span> <span class="nf">load_model</span><span class="p">(</span><span class="sh">'</span><span class="s">cifar10</span><span class="sh">'</span><span class="p">,</span> <span class="n">path</span><span class="p">)</span>
        <span class="n">model</span><span class="p">.</span><span class="nf">cuda</span><span class="p">()</span>
        <span class="c1"># Look at all models except this one
</span>        <span class="n">other_models</span> <span class="o">=</span> <span class="n">train_models</span><span class="p">[</span><span class="n">scenario</span><span class="p">][</span><span class="n">np</span><span class="p">.</span><span class="nf">delete</span><span class="p">(</span><span class="n">all_except</span><span class="p">,</span> <span class="n">i</span><span class="p">)]</span>
        <span class="c1"># Pick random models
</span>        <span class="n">other_models</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="nf">choice</span><span class="p">(</span><span class="n">other_models</span><span class="p">,</span> <span class="n">num_use_others</span><span class="p">,</span> <span class="n">replace</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>
        <span class="n">other_models</span> <span class="o">=</span> <span class="p">[</span><span class="n">x</span><span class="p">.</span><span class="nf">cuda</span><span class="p">()</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">other_models</span><span class="p">]</span>

        <span class="n">features_collected</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">array</span><span class="p">([</span><span class="nf">matt_modified_scores</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="n">features</span><span class="p">,</span> <span class="n">labels</span><span class="p">,</span> <span class="n">other_model</span><span class="p">)</span> <span class="k">for</span> <span class="n">other_model</span> <span class="ow">in</span> <span class="n">other_models</span><span class="p">])</span>
        <span class="n">scores</span> <span class="o">=</span> <span class="nf">extract_features_for_reference_models</span><span class="p">(</span><span class="n">features_collected</span><span class="p">)</span>
        <span class="n">other_features</span> <span class="o">=</span> <span class="nf">custom_feature_collection</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="n">features</span><span class="p">,</span> <span class="n">labels</span><span class="p">,</span> <span class="n">use_dp</span> <span class="o">=</span> <span class="n">use_dp</span><span class="p">)</span>
        <span class="n">scores</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">concatenate</span><span class="p">((</span><span class="n">scores</span><span class="p">,</span> <span class="n">other_features</span><span class="p">),</span> <span class="mi">1</span><span class="p">)</span>

        <span class="n">mem_labels</span> <span class="o">=</span> <span class="n">challenge_dataset</span><span class="p">.</span><span class="nf">get_solutions</span><span class="p">()</span>

        <span class="c1"># Store
</span>        <span class="n">preds_all</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="n">mem_labels</span><span class="p">)</span>
        <span class="n">scores_all</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="n">scores</span><span class="p">)</span>
    
    <span class="n">preds_all</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">concatenate</span><span class="p">(</span><span class="n">preds_all</span><span class="p">)</span>
    <span class="n">scores_all</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">concatenate</span><span class="p">(</span><span class="n">scores_all</span><span class="p">)</span>
    
    <span class="n">X_for_meta</span><span class="p">[</span><span class="n">scenario</span><span class="p">]</span> <span class="o">=</span> <span class="n">scores_all</span>
    <span class="n">Y_for_meta</span><span class="p">[</span><span class="n">scenario</span><span class="p">]</span> <span class="o">=</span> <span class="n">preds_all</span>
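</code></pre></div></div> <p>As an aside, the per-scenario feature dictionaries built above can also be pooled into a single dataset, with the scenario identity appended as a one-hot feature (in my experiments, this pooled setup underperformed per-scenario meta-classifiers). Here is a minimal sketch of the pooling step; <code class="language-plaintext highlighter-rouge">pool_with_onehot</code> is a hypothetical helper of my own, not part of the pipeline above:</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np

def pool_with_onehot(X_for_meta, Y_for_meta):
    # Stack features from every scenario, appending a one-hot scenario indicator
    scenarios = sorted(X_for_meta.keys())
    X_all, y_all = [], []
    for idx, sc in enumerate(scenarios):
        onehot = np.zeros((len(X_for_meta[sc]), len(scenarios)))
        onehot[:, idx] = 1.0
        X_all.append(np.concatenate([X_for_meta[sc], onehot], axis=1))
        y_all.append(Y_for_meta[sc])
    return np.concatenate(X_all), np.concatenate(y_all)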
</code></pre></div></div> <p>Using <a href="https://automl.github.io/auto-sklearn/master/">auto-sklearn</a> (an AutoML package that does pipeline, optimizer, and hyper-parameter optimization for you) worked out best, with random-forest classifiers coming in second.</p> <div class="alert alert-info" role="alert"> Training classifiers separately for the different scenarios (no, low, high DP) worked better than training a single classifier, even when the scenario is explicitly provided as an input feature (one-hot encoding). </div> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Train different meta-classifiers per scenario
</span><span class="n">CHALLENGE</span> <span class="o">=</span> <span class="sh">"</span><span class="s">cifar10</span><span class="sh">"</span>
<span class="n">scenarios</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="nf">listdir</span><span class="p">(</span><span class="n">CHALLENGE</span><span class="p">)</span>
<span class="n">meta_clfs</span> <span class="o">=</span> <span class="p">{</span><span class="n">x</span><span class="p">:</span> <span class="n">autosklearn</span><span class="p">.</span><span class="n">classification</span><span class="p">.</span><span class="nc">AutoSklearnClassifier</span><span class="p">(</span><span class="n">memory_limit</span><span class="o">=</span><span class="mi">64</span> <span class="o">*</span> <span class="mi">1024</span><span class="p">,</span> <span class="n">time_left_for_this_task</span><span class="o">=</span><span class="mi">180</span><span class="p">,</span> <span class="n">metric</span><span class="o">=</span><span class="n">autosklearn</span><span class="p">.</span><span class="n">metrics</span><span class="p">.</span><span class="n">roc_auc</span><span class="p">)</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">X_for_meta</span><span class="p">.</span><span class="nf">keys</span><span class="p">()}</span>

<span class="n">avg</span> <span class="o">=</span> <span class="mi">0</span>
<span class="n">use_all</span> <span class="o">=</span> <span class="bp">False</span>
<span class="k">for</span> <span class="n">sc</span> <span class="ow">in</span> <span class="n">scenarios</span><span class="p">:</span>
    <span class="n">train_split_og</span><span class="p">,</span> <span class="n">test_split_og</span> <span class="o">=</span> <span class="nf">train_test_split</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="nf">arange</span><span class="p">(</span><span class="mi">100</span><span class="p">),</span> <span class="n">test_size</span><span class="o">=</span><span class="mi">20</span><span class="p">)</span>
    <span class="n">train_split</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">concatenate</span><span class="p">([</span><span class="n">np</span><span class="p">.</span><span class="nf">arange</span><span class="p">(</span><span class="mi">200</span><span class="p">)</span> <span class="o">+</span> <span class="mi">200</span> <span class="o">*</span> <span class="n">i</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">train_split_og</span><span class="p">])</span>
    <span class="n">test_split</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">concatenate</span><span class="p">([</span><span class="n">np</span><span class="p">.</span><span class="nf">arange</span><span class="p">(</span><span class="mi">200</span><span class="p">)</span> <span class="o">+</span> <span class="mi">200</span> <span class="o">*</span> <span class="n">i</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">test_split_og</span><span class="p">])</span>
    <span class="k">if</span> <span class="n">use_all</span><span class="p">:</span>
        <span class="n">X_train</span> <span class="o">=</span> <span class="n">X_for_meta</span><span class="p">[</span><span class="n">sc</span><span class="p">]</span>
        <span class="n">X_test</span>  <span class="o">=</span> <span class="n">X_for_meta</span><span class="p">[</span><span class="n">sc</span><span class="p">]</span>
        <span class="n">y_train</span> <span class="o">=</span> <span class="n">Y_for_meta</span><span class="p">[</span><span class="n">sc</span><span class="p">]</span>
        <span class="n">y_test</span>  <span class="o">=</span> <span class="n">Y_for_meta</span><span class="p">[</span><span class="n">sc</span><span class="p">]</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="n">X_train</span> <span class="o">=</span> <span class="n">X_for_meta</span><span class="p">[</span><span class="n">sc</span><span class="p">][</span><span class="n">train_split</span><span class="p">]</span>
        <span class="n">X_test</span>  <span class="o">=</span> <span class="n">X_for_meta</span><span class="p">[</span><span class="n">sc</span><span class="p">][</span><span class="n">test_split</span><span class="p">]</span>
        <span class="n">y_train</span> <span class="o">=</span> <span class="n">Y_for_meta</span><span class="p">[</span><span class="n">sc</span><span class="p">][</span><span class="n">train_split</span><span class="p">]</span>
        <span class="n">y_test</span>  <span class="o">=</span> <span class="n">Y_for_meta</span><span class="p">[</span><span class="n">sc</span><span class="p">][</span><span class="n">test_split</span><span class="p">]</span>

    <span class="n">meta_clfs</span><span class="p">[</span><span class="n">sc</span><span class="p">].</span><span class="nf">fit</span><span class="p">(</span><span class="n">X_train</span><span class="p">,</span> <span class="n">y_train</span><span class="p">)</span>
    <span class="n">preds</span> <span class="o">=</span> <span class="n">meta_clfs</span><span class="p">[</span><span class="n">sc</span><span class="p">].</span><span class="nf">predict_proba</span><span class="p">(</span><span class="n">X_test</span><span class="p">)[:,</span> <span class="mi">1</span><span class="p">]</span>
    <span class="n">preds_train</span> <span class="o">=</span> <span class="n">meta_clfs</span><span class="p">[</span><span class="n">sc</span><span class="p">].</span><span class="nf">predict_proba</span><span class="p">(</span><span class="n">X_train</span><span class="p">)[:,</span> <span class="mi">1</span><span class="p">]</span>
    
    <span class="nf">print</span><span class="p">(</span><span class="sa">f</span><span class="sh">"</span><span class="si">{</span><span class="n">sc</span><span class="si">}</span><span class="s"> AUC (train): </span><span class="si">{</span><span class="nf">roc_auc_score</span><span class="p">(</span><span class="n">y_train</span><span class="p">,</span> <span class="n">preds_train</span><span class="p">)</span><span class="si">}</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">scores</span> <span class="o">=</span> <span class="nf">score</span><span class="p">(</span><span class="n">y_test</span><span class="p">,</span> <span class="n">preds</span><span class="p">)</span>
    <span class="n">scores</span><span class="p">.</span><span class="nf">pop</span><span class="p">(</span><span class="sh">'</span><span class="s">fpr</span><span class="sh">'</span><span class="p">,</span> <span class="bp">None</span><span class="p">)</span>
    <span class="n">scores</span><span class="p">.</span><span class="nf">pop</span><span class="p">(</span><span class="sh">'</span><span class="s">tpr</span><span class="sh">'</span><span class="p">,</span> <span class="bp">None</span><span class="p">)</span>
    <span class="nf">display</span><span class="p">(</span><span class="n">pd</span><span class="p">.</span><span class="nc">DataFrame</span><span class="p">([</span><span class="n">scores</span><span class="p">]))</span>
    <span class="n">avg</span> <span class="o">+=</span> <span class="n">scores</span><span class="p">[</span><span class="sh">'</span><span class="s">TPR_FPR_1000</span><span class="sh">'</span><span class="p">]</span>

<span class="nf">print</span><span class="p">(</span><span class="sh">"</span><span class="s">Average score</span><span class="sh">"</span><span class="p">,</span> <span class="n">avg</span> <span class="o">/</span> <span class="mi">3</span><span class="p">)</span>
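</code></pre></div></div> <p>The <code class="language-plaintext highlighter-rouge">score</code> helper used above is external to this snippet, but for reference, here is a minimal sketch of how a TPR-at-fixed-FPR metric like <code class="language-plaintext highlighter-rouge">TPR_FPR_1000</code> (which I read as TPR at FPR = 1/1000) can be computed from raw membership scores. This is my own illustration, not necessarily the exact scoring code:</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np
from sklearn.metrics import roc_curve

def tpr_at_fpr(y_true, y_score, target_fpr=0.001):
    # Highest TPR among thresholds whose FPR does not exceed the target.
    # roc_curve returns FPR sorted in ascending order, so searchsorted
    # finds how many operating points satisfy the FPR budget.
    fpr, tpr, _ = roc_curve(y_true, y_score)
    cutoff = np.searchsorted(fpr, target_fpr, side="right")
    return float(np.max(tpr[:cutoff]))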
</code></pre></div></div> <p>For submissions, I set <code class="language-plaintext highlighter-rouge">use_all</code> to True (since the AutoML classifier already does train-val splits internally). The train-test split above is just for visualization purposes (and to see how well the classifier performs on validation data).</p> <h2 id="feature-inspection">Feature Inspection</h2> <p>A closer look (via permutation importance) at the features shows some interesting trends.</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="n">sklearn.inspection</span> <span class="kn">import</span> <span class="n">plot_partial_dependence</span><span class="p">,</span> <span class="n">permutation_importance</span>

<span class="n">labels</span> <span class="o">=</span> <span class="p">[]</span>
<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nf">range</span><span class="p">(</span><span class="mi">4</span><span class="p">):</span>
    <span class="n">labels</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="sa">f</span><span class="sh">"</span><span class="s">range_</span><span class="si">{</span><span class="n">i</span><span class="o">+</span><span class="mi">1</span><span class="si">}</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">labels</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="sa">f</span><span class="sh">"</span><span class="s">abssum_</span><span class="si">{</span><span class="n">i</span><span class="o">+</span><span class="mi">1</span><span class="si">}</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">labels</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="sa">f</span><span class="sh">"</span><span class="s">min_</span><span class="si">{</span><span class="n">i</span><span class="o">+</span><span class="mi">1</span><span class="si">}</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">labels</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="sa">f</span><span class="sh">"</span><span class="s">max_</span><span class="si">{</span><span class="n">i</span><span class="o">+</span><span class="mi">1</span><span class="si">}</span><span class="sh">"</span><span class="p">)</span>
<span class="n">labels</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="sh">"</span><span class="s">ascent_loss</span><span class="sh">"</span><span class="p">)</span>
<span class="n">labels</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="sh">"</span><span class="s">ascent_diff</span><span class="sh">"</span><span class="p">)</span>
<span class="n">labels</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="sh">"</span><span class="s">adv_ascent_loss</span><span class="sh">"</span><span class="p">)</span>
<span class="n">labels</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="sh">"</span><span class="s">adv_ascent_diff</span><span class="sh">"</span><span class="p">)</span>
<span class="n">labels</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="sh">"</span><span class="s">ext_epo_1</span><span class="sh">"</span><span class="p">)</span>
<span class="n">labels</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="sh">"</span><span class="s">ext_epo_2</span><span class="sh">"</span><span class="p">)</span>
<span class="n">labels</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="sh">"</span><span class="s">ext_epo_3</span><span class="sh">"</span><span class="p">)</span>
<span class="n">labels</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="sh">"</span><span class="s">merlin</span><span class="sh">"</span><span class="p">)</span>
<span class="n">labels</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="sh">"</span><span class="s">lira</span><span class="sh">"</span><span class="p">)</span>
<span class="n">labels</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="sh">"</span><span class="s">blindwalk</span><span class="sh">"</span><span class="p">)</span>
</code></pre></div></div> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="n">scenario</span> <span class="o">=</span> <span class="sh">"</span><span class="s">cifar10_inf</span><span class="sh">"</span>
<span class="n">r</span> <span class="o">=</span> <span class="nf">permutation_importance</span><span class="p">(</span><span class="n">meta_clfs</span><span class="p">[</span><span class="n">scenario</span><span class="p">],</span> <span class="n">X_for_meta</span><span class="p">[</span><span class="n">scenario</span><span class="p">],</span> <span class="n">Y_for_meta</span><span class="p">[</span><span class="n">scenario</span><span class="p">],</span> <span class="n">n_repeats</span><span class="o">=</span><span class="mi">10</span><span class="p">,</span> <span class="n">random_state</span><span class="o">=</span><span class="mi">0</span><span class="p">)</span>
<span class="n">sort_idx</span> <span class="o">=</span> <span class="n">r</span><span class="p">.</span><span class="n">importances_mean</span><span class="p">.</span><span class="nf">argsort</span><span class="p">()[::</span><span class="o">-</span><span class="mi">1</span><span class="p">]</span>

<span class="n">plt</span><span class="p">.</span><span class="nf">boxplot</span><span class="p">(</span>
    <span class="n">r</span><span class="p">.</span><span class="n">importances</span><span class="p">[</span><span class="n">sort_idx</span><span class="p">].</span><span class="n">T</span><span class="p">,</span> <span class="n">labels</span><span class="o">=</span><span class="p">[</span><span class="n">labels</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">sort_idx</span><span class="p">]</span>
<span class="p">)</span>
    
<span class="n">plt</span><span class="p">.</span><span class="nf">xticks</span><span class="p">(</span><span class="n">rotation</span><span class="o">=</span><span class="mi">90</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">tight_layout</span><span class="p">()</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">show</span><span class="p">()</span>

<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">sort_idx</span><span class="p">[::</span><span class="o">-</span><span class="mi">1</span><span class="p">]:</span>
    <span class="nf">print</span><span class="p">(</span>
        <span class="sa">f</span><span class="sh">"</span><span class="s">[</span><span class="si">{</span><span class="n">i</span><span class="si">}</span><span class="s">] </span><span class="si">{</span><span class="n">labels</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="si">:</span><span class="mi">10</span><span class="n">s</span><span class="si">}</span><span class="s">: </span><span class="si">{</span><span class="n">r</span><span class="p">.</span><span class="n">importances_mean</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="si">:</span><span class="p">.</span><span class="mi">3</span><span class="n">f</span><span class="si">}</span><span class="s"> +/- </span><span class="sh">"</span>
        <span class="sa">f</span><span class="sh">"</span><span class="si">{</span><span class="n">r</span><span class="p">.</span><span class="n">importances_std</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="si">:</span><span class="p">.</span><span class="mi">3</span><span class="n">f</span><span class="si">}</span><span class="sh">"</span>
    <span class="p">)</span>
</code></pre></div></div> <p>The abs-sum and range of gradient alignment values across reference models seem to be the most useful. Features like LiRA and gradient norms in the extended-epoch simulation also seem to be important for the case of no DP.</p> <p><img src="assets/img/mico/cifar10_inf.png" alt="png"/></p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>...
[20] ext_epo_1 : 0.010 +/- 0.001
[0] range_1   : 0.013 +/- 0.001
[15] max_4     : 0.016 +/- 0.003
[13] abssum_4  : 0.026 +/- 0.003
[5] abssum_2  : 0.028 +/- 0.003
[24] lira      : 0.045 +/- 0.002
[9] abssum_3  : 0.047 +/- 0.003
[21] ext_epo_2 : 0.054 +/- 0.003
[22] ext_epo_3 : 0.064 +/- 0.003
</code></pre></div></div> <p>However, the same trend does not hold when looking at models with DP.</p> <p><img src="assets/img/mico/cifar10_hi.png" alt="png"/></p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>...
[22] ext_epo_3 : 0.018 +/- 0.002
[1] abssum_1  : 0.020 +/- 0.003
[12] range_4   : 0.020 +/- 0.003
[0] range_1   : 0.020 +/- 0.002
[13] abssum_4  : 0.023 +/- 0.003
[5] abssum_2  : 0.024 +/- 0.003
[9] abssum_3  : 0.027 +/- 0.002
[24] lira      : 0.053 +/- 0.003
</code></pre></div></div> <p>In this case, most of the classifier’s performance seems to originate from LiRA, followed by features extracted from reference models. However, features like the one extracted from the extended-epoch simulation seem to be somewhat less important - noticeably less, in fact, than in the case of no DP.</p> <p>Interestingly, for the case of low-$\epsilon$ DP, merlin-based features turn out to be the most important.</p> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/mico/cifar10_lo-480.webp 480w,/assets/img/mico/cifar10_lo-800.webp 800w,/assets/img/mico/cifar10_lo-1400.webp 1400w," type="image/webp" sizes="95vw"/> <img src="/assets/img/mico/cifar10_lo.png" class="img-fluid rounded z-depth-1" width="100%" height="auto" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>...
[15] max_4     : 0.012 +/- 0.002
[1] abssum_1  : 0.014 +/- 0.003
[14] min_4     : 0.014 +/- 0.002
[24] lira      : 0.016 +/- 0.002
[8] range_3   : 0.017 +/- 0.003
[20] ext_epo_1 : 0.027 +/- 0.004
[23] merlin    : 0.028 +/- 0.002
</code></pre></div></div> <p>The utility of each of these features, though, is significantly lower than in scenarios with higher-$\epsilon$ DP, or no DP at all. This is understandable, since DP is very much intended to reduce membership inference risk. However, it is interesting that the relative utility of features is not the same across the scenarios, which further shows why having individual meta-classifiers for the scenarios is useful.</p> <hr/> <p>Next, we collect features for all models across scenarios and phases. Storing these makes it easy to re-use features when experimenting with different meta-classifiers. There is no need to re-run for ‘train’ models, since we already have features for those.</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">CHALLENGE</span> <span class="o">=</span> <span class="sh">"</span><span class="s">cifar10</span><span class="sh">"</span>
<span class="n">LEN_TRAINING</span> <span class="o">=</span> <span class="mi">50000</span>
<span class="n">LEN_CHALLENGE</span> <span class="o">=</span> <span class="mi">100</span>

<span class="n">scenarios</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="nf">listdir</span><span class="p">(</span><span class="n">CHALLENGE</span><span class="p">)</span>
<span class="n">phases</span> <span class="o">=</span> <span class="p">[</span><span class="sh">'</span><span class="s">dev</span><span class="sh">'</span><span class="p">,</span> <span class="sh">'</span><span class="s">final</span><span class="sh">'</span><span class="p">]</span>
<span class="n">stored_features</span> <span class="o">=</span> <span class="p">{}</span>

<span class="n">dataset</span> <span class="o">=</span> <span class="nf">load_cifar10</span><span class="p">(</span><span class="n">dataset_dir</span><span class="o">=</span><span class="sh">"</span><span class="s">/u/as9rw/work/MICO/data</span><span class="sh">"</span><span class="p">)</span>

<span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">scenario</span> <span class="ow">in</span> <span class="nf">tqdm</span><span class="p">(</span><span class="nf">enumerate</span><span class="p">(</span><span class="n">scenarios</span><span class="p">),</span> <span class="n">desc</span><span class="o">=</span><span class="sh">"</span><span class="s">scenario</span><span class="sh">"</span><span class="p">,</span> <span class="n">total</span><span class="o">=</span><span class="mi">3</span><span class="p">):</span>
    <span class="n">use_dp</span> <span class="o">=</span> <span class="ow">not</span> <span class="n">scenario</span><span class="p">.</span><span class="nf">endswith</span><span class="p">(</span><span class="sh">'</span><span class="s">_inf</span><span class="sh">'</span><span class="p">)</span>
    <span class="n">stored_features</span><span class="p">[</span><span class="n">scenario</span><span class="p">]</span> <span class="o">=</span> <span class="p">{}</span>
    <span class="k">for</span> <span class="n">phase</span> <span class="ow">in</span> <span class="nf">tqdm</span><span class="p">(</span><span class="n">phases</span><span class="p">,</span> <span class="n">desc</span><span class="o">=</span><span class="sh">"</span><span class="s">phase</span><span class="sh">"</span><span class="p">):</span>
        <span class="n">stored_features</span><span class="p">[</span><span class="n">scenario</span><span class="p">][</span><span class="n">phase</span><span class="p">]</span> <span class="o">=</span> <span class="p">[]</span>
        <span class="n">root</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="n">CHALLENGE</span><span class="p">,</span> <span class="n">scenario</span><span class="p">,</span> <span class="n">phase</span><span class="p">)</span>
        <span class="n">all_except</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">arange</span><span class="p">(</span><span class="mi">100</span><span class="p">)</span>
        <span class="k">for</span> <span class="n">j</span><span class="p">,</span> <span class="n">model_folder</span> <span class="ow">in</span> <span class="nf">tqdm</span><span class="p">(</span><span class="nf">enumerate</span><span class="p">(</span><span class="nf">sorted</span><span class="p">(</span><span class="n">os</span><span class="p">.</span><span class="nf">listdir</span><span class="p">(</span><span class="n">root</span><span class="p">),</span> <span class="n">key</span><span class="o">=</span><span class="k">lambda</span> <span class="n">d</span><span class="p">:</span> <span class="nf">int</span><span class="p">(</span><span class="n">d</span><span class="p">.</span><span class="nf">split</span><span class="p">(</span><span class="sh">'</span><span class="s">_</span><span class="sh">'</span><span class="p">)[</span><span class="mi">1</span><span class="p">]))),</span> <span class="n">desc</span><span class="o">=</span><span class="sh">"</span><span class="s">model</span><span class="sh">"</span><span class="p">,</span> <span class="n">total</span><span class="o">=</span><span class="nf">len</span><span class="p">(</span><span class="n">os</span><span class="p">.</span><span class="nf">listdir</span><span class="p">(</span><span class="n">root</span><span class="p">))):</span>
            <span class="n">path</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="n">root</span><span class="p">,</span> <span class="n">model_folder</span><span class="p">)</span>
            <span class="n">challenge_dataset</span> <span class="o">=</span> <span class="n">ChallengeDataset</span><span class="p">.</span><span class="nf">from_path</span><span class="p">(</span><span class="n">path</span><span class="p">,</span> <span class="n">dataset</span><span class="o">=</span><span class="n">dataset</span><span class="p">,</span> <span class="n">len_training</span><span class="o">=</span><span class="n">LEN_TRAINING</span><span class="p">)</span>
            <span class="n">challenge_points</span> <span class="o">=</span> <span class="n">challenge_dataset</span><span class="p">.</span><span class="nf">get_challenges</span><span class="p">()</span>
            
            <span class="n">model</span> <span class="o">=</span> <span class="nf">load_model</span><span class="p">(</span><span class="sh">'</span><span class="s">cifar10</span><span class="sh">'</span><span class="p">,</span> <span class="n">path</span><span class="p">)</span>
            <span class="n">challenge_dataloader</span> <span class="o">=</span> <span class="n">torch</span><span class="p">.</span><span class="n">utils</span><span class="p">.</span><span class="n">data</span><span class="p">.</span><span class="nc">DataLoader</span><span class="p">(</span><span class="n">challenge_points</span><span class="p">,</span> <span class="n">batch_size</span><span class="o">=</span><span class="mi">2</span><span class="o">*</span><span class="n">LEN_CHALLENGE</span><span class="p">)</span>
            <span class="n">features</span><span class="p">,</span> <span class="n">labels</span> <span class="o">=</span> <span class="nf">next</span><span class="p">(</span><span class="nf">iter</span><span class="p">(</span><span class="n">challenge_dataloader</span><span class="p">))</span>

            <span class="n">model</span><span class="p">.</span><span class="nf">cuda</span><span class="p">()</span>
            <span class="n">features</span><span class="p">,</span> <span class="n">labels</span> <span class="o">=</span> <span class="n">features</span><span class="p">.</span><span class="nf">cuda</span><span class="p">(),</span> <span class="n">labels</span><span class="p">.</span><span class="nf">cuda</span><span class="p">()</span>
            <span class="c1"># Look at all models except this one
</span>            <span class="n">other_models</span> <span class="o">=</span> <span class="n">train_models</span><span class="p">[</span><span class="n">scenario</span><span class="p">][</span><span class="n">np</span><span class="p">.</span><span class="nf">delete</span><span class="p">(</span><span class="n">all_except</span><span class="p">,</span> <span class="n">j</span><span class="p">)]</span>
            <span class="n">other_models</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="nf">choice</span><span class="p">(</span><span class="n">other_models</span><span class="p">,</span> <span class="n">num_use_others</span><span class="p">,</span> <span class="n">replace</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>
            <span class="n">features_collected</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">array</span><span class="p">([</span><span class="nf">matt_modified_scores</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="n">features</span><span class="p">,</span> <span class="n">labels</span><span class="p">,</span> <span class="n">other_model</span><span class="p">.</span><span class="nf">cuda</span><span class="p">())</span> <span class="k">for</span> <span class="n">other_model</span> <span class="ow">in</span> <span class="n">other_models</span><span class="p">])</span>
            <span class="n">scores</span> <span class="o">=</span> <span class="nf">extract_features_for_reference_models</span><span class="p">(</span><span class="n">features_collected</span><span class="p">)</span>
            <span class="n">other_features</span> <span class="o">=</span> <span class="nf">custom_feature_collection</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="n">features</span><span class="p">,</span> <span class="n">labels</span><span class="p">,</span> <span class="n">use_dp</span><span class="o">=</span><span class="n">use_dp</span><span class="p">)</span>
            <span class="n">processed_features</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">concatenate</span><span class="p">((</span><span class="n">scores</span><span class="p">,</span> <span class="n">other_features</span><span class="p">),</span> <span class="mi">1</span><span class="p">)</span>

            <span class="n">stored_features</span><span class="p">[</span><span class="n">scenario</span><span class="p">][</span><span class="n">phase</span><span class="p">].</span><span class="nf">append</span><span class="p">(</span><span class="n">processed_features</span><span class="p">)</span>
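Since feature extraction is the slow step, it can help to persist `stored_features` to disk so meta-classifier experiments reload it instead of re-running every model. A sketch (the helper names and file path are mine, not from the original code):

```python
# Illustrative caching helpers for the scenario -> phase -> features dict.
import pickle

def save_features(stored_features, path="features_cache.pkl"):
    with open(path, "wb") as f:
        pickle.dump(stored_features, f)

def load_features(path="features_cache.pkl"):
    with open(path, "rb") as f:
        return pickle.load(f)
```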
</code></pre></div></div> <p>Time to generate predictions!</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">CHALLENGE</span> <span class="o">=</span> <span class="sh">"</span><span class="s">cifar10</span><span class="sh">"</span>
<span class="n">LEN_TRAINING</span> <span class="o">=</span> <span class="mi">50000</span>
<span class="n">LEN_CHALLENGE</span> <span class="o">=</span> <span class="mi">100</span>

<span class="n">scenarios</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="nf">listdir</span><span class="p">(</span><span class="n">CHALLENGE</span><span class="p">)</span>
<span class="n">phases</span> <span class="o">=</span> <span class="p">[</span><span class="sh">'</span><span class="s">dev</span><span class="sh">'</span><span class="p">,</span> <span class="sh">'</span><span class="s">final</span><span class="sh">'</span><span class="p">]</span>

<span class="n">dataset</span> <span class="o">=</span> <span class="nf">load_cifar10</span><span class="p">(</span><span class="n">dataset_dir</span><span class="o">=</span><span class="sh">"</span><span class="s">/u/as9rw/work/MICO/data</span><span class="sh">"</span><span class="p">)</span>

<span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">scenario</span> <span class="ow">in</span> <span class="nf">tqdm</span><span class="p">(</span><span class="nf">enumerate</span><span class="p">(</span><span class="n">scenarios</span><span class="p">),</span> <span class="n">desc</span><span class="o">=</span><span class="sh">"</span><span class="s">scenario</span><span class="sh">"</span><span class="p">,</span> <span class="n">total</span><span class="o">=</span><span class="mi">3</span><span class="p">):</span>
    <span class="k">for</span> <span class="n">phase</span> <span class="ow">in</span> <span class="nf">tqdm</span><span class="p">(</span><span class="n">phases</span><span class="p">,</span> <span class="n">desc</span><span class="o">=</span><span class="sh">"</span><span class="s">phase</span><span class="sh">"</span><span class="p">):</span>
        <span class="n">root</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="n">CHALLENGE</span><span class="p">,</span> <span class="n">scenario</span><span class="p">,</span> <span class="n">phase</span><span class="p">)</span>
        <span class="n">j</span> <span class="o">=</span> <span class="mi">0</span>
        <span class="k">for</span> <span class="n">model_folder</span> <span class="ow">in</span> <span class="nf">tqdm</span><span class="p">(</span><span class="nf">sorted</span><span class="p">(</span><span class="n">os</span><span class="p">.</span><span class="nf">listdir</span><span class="p">(</span><span class="n">root</span><span class="p">),</span> <span class="n">key</span><span class="o">=</span><span class="k">lambda</span> <span class="n">d</span><span class="p">:</span> <span class="nf">int</span><span class="p">(</span><span class="n">d</span><span class="p">.</span><span class="nf">split</span><span class="p">(</span><span class="sh">'</span><span class="s">_</span><span class="sh">'</span><span class="p">)[</span><span class="mi">1</span><span class="p">])),</span> <span class="n">desc</span><span class="o">=</span><span class="sh">"</span><span class="s">model</span><span class="sh">"</span><span class="p">):</span>
            <span class="n">path</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="n">root</span><span class="p">,</span> <span class="n">model_folder</span><span class="p">)</span>
            <span class="n">features_use</span> <span class="o">=</span> <span class="n">stored_features</span><span class="p">[</span><span class="n">scenario</span><span class="p">][</span><span class="n">phase</span><span class="p">][</span><span class="n">j</span><span class="p">]</span>
            <span class="c1"># Using scenario-wise meta-classifier
</span>            <span class="n">predictions</span> <span class="o">=</span> <span class="n">meta_clfs</span><span class="p">[</span><span class="n">scenario</span><span class="p">].</span><span class="nf">predict_proba</span><span class="p">(</span><span class="n">features_use</span><span class="p">)[:,</span> <span class="mi">1</span><span class="p">]</span>
            <span class="n">j</span> <span class="o">+=</span> <span class="mi">1</span>
            <span class="k">assert</span> <span class="n">np</span><span class="p">.</span><span class="nf">all</span><span class="p">((</span><span class="mi">0</span> <span class="o">&lt;=</span> <span class="n">predictions</span><span class="p">)</span> <span class="o">&amp;</span> <span class="p">(</span><span class="n">predictions</span> <span class="o">&lt;=</span> <span class="mi">1</span><span class="p">))</span>

            <span class="k">with</span> <span class="nf">open</span><span class="p">(</span><span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="n">path</span><span class="p">,</span> <span class="sh">"</span><span class="s">prediction.csv</span><span class="sh">"</span><span class="p">),</span> <span class="sh">"</span><span class="s">w</span><span class="sh">"</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
                 <span class="n">csv</span><span class="p">.</span><span class="nf">writer</span><span class="p">(</span><span class="n">f</span><span class="p">).</span><span class="nf">writerow</span><span class="p">(</span><span class="n">predictions</span><span class="p">)</span>
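Each `prediction.csv` holds a single row of per-challenge probabilities. A self-contained round-trip of that format (done in memory here, so no paths are needed):

```python
# Write and read back a one-row CSV of probabilities, as the loop above does.
import csv
import io

preds = [0.12, 0.87, 0.5]
buf = io.StringIO()
csv.writer(buf).writerow(preds)
buf.seek(0)
recovered = [float(x) for x in next(csv.reader(buf))]
assert recovered == preds
assert all(0 <= p <= 1 for p in recovered)  # same sanity check as above
```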
</code></pre></div></div> <p>Packaging the submission: now we can store the predictions into a zip file, which you can submit to CodaLab.</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="n">zipfile</span>

<span class="n">phases</span> <span class="o">=</span> <span class="p">[</span><span class="sh">'</span><span class="s">dev</span><span class="sh">'</span><span class="p">,</span> <span class="sh">'</span><span class="s">final</span><span class="sh">'</span><span class="p">]</span>
<span class="n">experiment_name</span> <span class="o">=</span> <span class="sh">"</span><span class="s">final_submission</span><span class="sh">"</span>

<span class="k">with</span> <span class="n">zipfile</span><span class="p">.</span><span class="nc">ZipFile</span><span class="p">(</span><span class="sa">f</span><span class="sh">"</span><span class="s">submissions_cifar/</span><span class="si">{</span><span class="n">experiment_name</span><span class="si">}</span><span class="s">.zip</span><span class="sh">"</span><span class="p">,</span> <span class="sh">'</span><span class="s">w</span><span class="sh">'</span><span class="p">)</span> <span class="k">as</span> <span class="n">zipf</span><span class="p">:</span>
    <span class="k">for</span> <span class="n">scenario</span> <span class="ow">in</span> <span class="nf">tqdm</span><span class="p">(</span><span class="n">scenarios</span><span class="p">,</span> <span class="n">desc</span><span class="o">=</span><span class="sh">"</span><span class="s">scenario</span><span class="sh">"</span><span class="p">):</span> 
        <span class="k">for</span> <span class="n">phase</span> <span class="ow">in</span> <span class="nf">tqdm</span><span class="p">(</span><span class="n">phases</span><span class="p">,</span> <span class="n">desc</span><span class="o">=</span><span class="sh">"</span><span class="s">phase</span><span class="sh">"</span><span class="p">):</span>
            <span class="n">root</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="n">CHALLENGE</span><span class="p">,</span> <span class="n">scenario</span><span class="p">,</span> <span class="n">phase</span><span class="p">)</span>
            <span class="k">for</span> <span class="n">model_folder</span> <span class="ow">in</span> <span class="nf">tqdm</span><span class="p">(</span><span class="nf">sorted</span><span class="p">(</span><span class="n">os</span><span class="p">.</span><span class="nf">listdir</span><span class="p">(</span><span class="n">root</span><span class="p">),</span> <span class="n">key</span><span class="o">=</span><span class="k">lambda</span> <span class="n">d</span><span class="p">:</span> <span class="nf">int</span><span class="p">(</span><span class="n">d</span><span class="p">.</span><span class="nf">split</span><span class="p">(</span><span class="sh">'</span><span class="s">_</span><span class="sh">'</span><span class="p">)[</span><span class="mi">1</span><span class="p">])),</span> <span class="n">desc</span><span class="o">=</span><span class="sh">"</span><span class="s">model</span><span class="sh">"</span><span class="p">):</span>
                <span class="n">path</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="n">root</span><span class="p">,</span> <span class="n">model_folder</span><span class="p">)</span>
                <span class="nb">file</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="n">path</span><span class="p">,</span> <span class="sh">"</span><span class="s">prediction.csv</span><span class="sh">"</span><span class="p">)</span>
                <span class="k">if</span> <span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="nf">exists</span><span class="p">(</span><span class="nb">file</span><span class="p">):</span>
                    <span class="n">zipf</span><span class="p">.</span><span class="nf">write</span><span class="p">(</span><span class="nb">file</span><span class="p">)</span>
                <span class="k">else</span><span class="p">:</span>
                    <span class="k">raise</span> <span class="nc">FileNotFoundError</span><span class="p">(</span><span class="sa">f</span><span class="sh">"</span><span class="s">`prediction.csv` not found in </span><span class="si">{</span><span class="n">path</span><span class="si">}</span><span class="s">. You need to provide predictions for all challenges</span><span class="sh">"</span><span class="p">)</span>
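Before uploading, it can be worth sanity-checking the archive contents as well; a hedged sketch (the helper name is mine, not from the original code):

```python
# Verify a submission zip contains at least one file, all of them prediction.csv.
import zipfile

def check_submission(zip_path):
    with zipfile.ZipFile(zip_path) as zf:
        names = zf.namelist()
    return len(names) > 0 and all(n.endswith("prediction.csv") for n in names)
```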
</code></pre></div></div> <h2 id="takeaways">Takeaways</h2> <p>This was my first stint with actively developing and launching a membership inference attack, and I definitely learned a lot! I can now see that membership inference is <strong>hard</strong>, even when access to a highly-overlapping dataset (and models trained on similar data) is provided. I also learned that while using observation-based features directly (such as LiRA, scaled to the $[0, 1]$ range) is common practice in MI literature, switching to better classifiers than linear regression (better yet, automating the entire model pipeline) can give non-trivial boosts in performance. I hope this post was helpful to you, and I’d love to hear your thoughts and feedback!</p> <p>My solutions (for both CIFAR and Purchase100) are available here: <a href="https://github.com/iamgroot42/MICO">https://github.com/iamgroot42/MICO</a></p>]]></content><author><name>Anshuman Suri</name></author><category term="competition"/><category term="membership inference"/><category term="privacy"/><category term="mico"/><summary type="html"><![CDATA[Description of my entry to the MICO challenge (co-located with SaTML) for membership inference that won me 2nd place on the CIFAR track.]]></summary></entry><entry><title type="html">Dissecting Distribution Inference</title><link href="https://anshumansuri.com/blog/2022/ddi/" rel="alternate" type="text/html" title="Dissecting Distribution Inference"/><published>2022-12-15T00:00:00+00:00</published><updated>2022-12-15T00:00:00+00:00</updated><id>https://anshumansuri.com/blog/2022/ddi</id><content type="html" xml:base="https://anshumansuri.com/blog/2022/ddi/"><![CDATA[<p>Distribution inference attacks aim to infer statistical properties of data used to train machine learning models. 
These attacks are sometimes surprisingly potent, as we demonstrated in <a href="https://uvasrg.github.io/on-the-risks-of-distribution-inference/">previous work</a>.</p> <h2 id="kl-divergence-attack">KL Divergence Attack</h2> <p>Most attacks against distribution inference involve training a meta-classifier, either using model parameters in white-box settings <d-cite key="ganju2018property"></d-cite>, or using model predictions in black-box scenarios <d-cite key="zhang2021leakage"></d-cite>. While other black-box attacks were proposed in our prior work, they are not as accurate as meta-classifier-based methods, and require training shadow models nonetheless <d-cite key="suri2022formalizing"></d-cite>.</p> <p>We propose a new attack: the KL Divergence Attack. Using a sample of data, the adversary computes predictions on local models from both distributions, as well as on the victim’s model. Then, it uses the prediction probabilities to compute the KL divergence between the victim’s model and the local models to make its predictions. Our attack outperforms even the current state-of-the-art white-box attacks.</p> <p>We observe several interesting trends across our experiments. One striking example is the effect of varying task-property correlation. 
While intuition suggests increasing inference leakage with increasing correlation between the classifier’s task and the property being inferred, we observe no such trend:</p> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/ddi/correlation_box-480.webp 480w,/assets/img/ddi/correlation_box-800.webp 800w,/assets/img/ddi/correlation_box-1400.webp 1400w," type="image/webp" sizes="95vw"/> <img src="/assets/img/ddi/correlation_box.png" class="img-fluid rounded z-depth-1" width="100%" height="auto" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> <div class="caption"> Distinguishing accuracy for different task-property pairs on the Celeb-A dataset, for varying correlation. Task-property correlations are: $\approx 0$ (Mouth Slightly Open-Wavy Hair), $\approx 0.14$ (Smiling-Female), $\approx 0.28$ (Female-Young), and $\approx 0.42$ (Mouth Slightly Open-High Cheekbones). </div> <h2 id="impact-of-adversarys-knowledge">Impact of adversary’s knowledge</h2> <p>We evaluate inference risk while relaxing a variety of implicit assumptions about the adversary’s knowledge in black-box setups. 
Concretely, we evaluate label-only API access settings, different victim-adversary feature extractors, and different victim-adversary model architectures.</p> <table> <tr> <th rowspan="2"> Victim Model </th> <th colspan="4"> Adversary Model </th> </tr> <tr> <th> RF </th> <th> LR </th> <th> MLP$_2$ </th> <th> MLP$_3$ </th> </tr> <tr> <td>Random Forest (RF)</td> <td> 12.0 </td> <td> 1.7 </td> <td> 5.4 </td> <td> 4.9 </td> </tr> <tr> <td>Linear Regression (LR)</td> <td> 13.5 </td> <td> 25.9 </td> <td> 3.7 </td> <td> 5.4 </td> </tr> <tr> <td>Two-layer perceptron (MLP$_2$)</td> <td> 0.9 </td> <td> 0.3 </td> <td> 4.2 </td> <td> 4.3 </td> </tr> <tr> <td>Three-layer perceptron (MLP$_3$)</td> <td> 0.2 </td> <td> 0.3 </td> <td> 4.0 </td> <td> 3.8 </td> </tr> </table> <p>Consider inference leakage for the Census19 dataset (table above with mean $n_{leaked}$ values) as an example. Inference risk is significantly higher when the adversary uses models with learning capacity similar to the victim, like both using one of (MLP$_2$, MLP$_3$) or (RF, MLP). 
Interestingly though, we also observe a sharp increase in inference risk when the victim uses models with low capacity, like LR and RF instead of multi-layer perceptrons.</p> <h2 id="defenses">Defenses</h2> <p>Finally, we evaluate the effectiveness of some empirical defenses, most of which add noise to the training process.</p> <p>For instance, while inference leakage reduces when the victim utilizes DP, most of the drop in effectiveness comes from a mismatch in the victim’s and adversary’s training environments:</p> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/ddi/dp_box-480.webp 480w,/assets/img/ddi/dp_box-800.webp 800w,/assets/img/ddi/dp_box-1400.webp 1400w," type="image/webp" sizes="95vw"/> <img src="/assets/img/ddi/dp_box.png" class="img-fluid rounded z-depth-1" width="100%" height="auto" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> <div class="caption"> Distinguishing accuracy for Census19 (Sex). Attack accuracy drops with stronger DP guarantees, i.e., decreasing privacy budget $\epsilon$. </div> <p>Compared to an adversary that does not use DP, there is a clear increase in inference risk (mean $n_{leaked}$ increases to 2.9 for $\epsilon=1.0$, and 4.8 for $\epsilon=0.12$ compared to 4.2 without any DP noise).</p> <p>Our exploration of potential defenses also reveals a strong connection between model generalization and inference risk (as apparent below, for the case of Celeb-A), suggesting that the defenses that do seem to work are attributable to poor model performance, and not something special about the defense itself (like adversarial training or label noise). 
</p> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/ddi/generalization_curve-480.webp 480w,/assets/img/ddi/generalization_curve-800.webp 800w,/assets/img/ddi/generalization_curve-1400.webp 1400w," type="image/webp" sizes="95vw"/> <img src="/assets/img/ddi/generalization_curve.png" class="img-fluid rounded z-depth-1" width="100%" height="auto" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> <div class="caption"> Mean distinguishing accuracy on Celeb-A (Sex), for varying number of training epochs for victim models. Shaded regions correspond to error bars. Distribution inference risk increases as the model trains, and then starts to decrease as the model starts to overfit. </div> <h2 id="summary">Summary</h2> <p>The general approach to achieve security and privacy for machine-learning models is to add noise, but our evaluations suggest this approach is not a principled or effective defense against distribution inference. The main reductions in inference accuracy that result from these defenses seem to be due to the way they disrupt the model’s ability to learn the distribution well.</p> <p><b>Paper</b>: <a href="http://anshumansuri.com/">Anshuman Suri</a>, Yifu Lu, Yanjin Chen, <a href="http://www.cs.virginia.edu/~evans/">David Evans</a>. <a href="https://arxiv.org/abs/2212.07591"><em>Dissecting Distribution Inference</em></a>. 
In <a href="https://satml.org/"><em>IEEE Conference on Secure and Trustworthy Machine Learning</em></a> (SaTML), 8-10 February 2023.</p> <p><b>Code</b>: <a href="https://github.com/iamgroot42/dissecting_distribution_inference">https://github.com/iamgroot42/dissecting_distribution_inference</a></p>]]></content><author><name>Anshuman Suri</name></author><category term="paper"/><category term="property inference"/><category term="privacy"/><category term="distribution inference"/><summary type="html"><![CDATA[Describing our work on distribution inference attacks.]]></summary></entry><entry><title type="html">Running scripts on Rivanna at UVA</title><link href="https://anshumansuri.com/blog/2022/uva-rivanna/" rel="alternate" type="text/html" title="Running scripts on Rivanna at UVA"/><published>2022-02-03T00:00:00+00:00</published><updated>2022-02-03T00:00:00+00:00</updated><id>https://anshumansuri.com/blog/2022/uva-rivanna</id><content type="html" xml:base="https://anshumansuri.com/blog/2022/uva-rivanna/"><![CDATA[<h1 id="overview">Overview</h1> <p>Our department has a nice collection of 16 servers, each with 4 GPUs. While this may sound like a lot, the servers are shared across the department and thus fill up pretty fast. We do have a SLURM cluster too, but it’s not as fast as the servers (nor big enough to run GPU-based jobs via SLURM). I recently looked into <a href="https://www.rc.virginia.edu/userinfo/rivanna/overview/">Rivanna</a> after my <a href="https://www.cs.virginia.edu/~evans/">advisor</a> suggested it. I did write some sbatch scripts during my internship at <a href="https://labs.oracle.com/pls/apex/labs/r/labs/intro">Oracle Research Labs</a>, so that definitely came in handy while writing wrapper scripts for the Rivanna cluster.</p> <h1 id="settings-things-up">Setting things up</h1> <p>The structure for these environments is pretty similar to what we have on the CS servers. 
You can load specific modules using <code class="language-plaintext highlighter-rouge">module load</code>. Since sbatch scripts are executed in a fresh bash environment, it’s a good idea to keep all your <code class="language-plaintext highlighter-rouge">module load</code> and other related commands (like <code class="language-plaintext highlighter-rouge">conda activate</code>) in your <code class="language-plaintext highlighter-rouge">.bashrc</code> file, so that you don’t have to add all of them to every sbatch file.</p> <p>For reference, here’s what I added to my <code class="language-plaintext highlighter-rouge">.bashrc</code>:</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>module load singularity
module load cudatoolkit/11.0.3-py3.8
module load cuda/11.4.2
module load cudnn/8.2.4.15
module load anaconda/2020.11-py3.8

<span class="c">#Identify whether using Rivanna version (load data accordingly)</span>
<span class="nb">export </span><span class="nv">ISRIVANNA</span><span class="o">=</span>1

<span class="c"># Conda-init related stuff that is auto-populated</span>
conda activate phd <span class="c"># phd is the name of my environment</span>
</code></pre></div></div> <p>The only downside here is that storage is not shared with the other CS servers: you must either commit to using only Rivanna or only the CS servers, or make sure you regularly sync your generated code (which is straightforward, thanks to Git) and data (not so trivial).</p> <div class="alert alert-danger" role="alert"> The Rivanna cluster has a cronjob of sorts that <b>deletes files</b> that aren't accessed for more than <b>90 days</b>. </div> <p>There are two storage directories: <code class="language-plaintext highlighter-rouge">/home</code> and <code class="language-plaintext highlighter-rouge">/scratch</code>. The former has a 50GB quota limit with weekly snapshots for backup, while the <code class="language-plaintext highlighter-rouge">/scratch</code> directory has a quota of 10TB/350,000 files (whichever limit is hit first). As mentioned on the <a href="https://www.rc.virginia.edu/userinfo/rivanna/storage/">Rivanna Storage</a> page:</p> <blockquote> <p>Slurm jobs run against /home will be slower than those run against /scratch</p> </blockquote> <p>Thus, it is advisable to keep all your scripts and data in the <code class="language-plaintext highlighter-rouge">/scratch</code> directory, even your Anaconda environment. You can specify a location for your Conda environment with the <code class="language-plaintext highlighter-rouge">--prefix &lt;PATH&gt;</code> flag while running <code class="language-plaintext highlighter-rouge">conda create</code>.</p> <h1 id="writing-sbatch-scripts">Writing SBATCH scripts</h1> <p>Okay, enough talk! Let’s start with a basic wrapper script; let’s call it <code class="language-plaintext highlighter-rouge">test.sbatch</code>.
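For reference, creating and activating a Conda environment under <code class="language-plaintext highlighter-rouge">/scratch</code> might look like the following sketch (the environment name and path here are illustrative, not from the original post):

```shell
# Create the environment under /scratch (path is illustrative) and
# activate it by path rather than by name.
conda create --prefix /scratch/$USER/envs/phd python=3.8 -y
conda activate /scratch/$USER/envs/phd
```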
If you are already familiar with these options and/or just want to use the defaults and get started, feel free to use the template <a href="https://gist.github.com/iamgroot42/2a29141b5cb241aa82aa80809b420437">here</a>.</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#!/bin/bash</span>

<span class="c">#SBATCH --ntasks=1</span>
<span class="c">#SBATCH -A uvasrg</span>
<span class="c">#SBATCH --mem=32G</span>
<span class="c">#SBATCH -p gpu</span>
<span class="c">#SBATCH --gres=gpu:rtx2080:1</span>
<span class="c">#SBATCH --cpus-per-task=10</span>
<span class="c">#SBATCH --time=3-00:00:00</span>
<span class="c">#SBATCH --output=logs/%x-%j.out</span>
<span class="c">#SBATCH --error=logs/%x-%j.err</span>

<span class="c"># Your command goes here</span>
<span class="nb">echo</span> <span class="s2">"Parameters were </span><span class="nv">$PARAM1</span><span class="s2"> and </span><span class="nv">$PARAM2</span><span class="s2">"</span>
</code></pre></div></div> <p>Wait, wait, wait! Aren’t the <code class="language-plaintext highlighter-rouge">#SBATCH</code> commands going to get ignored (they start with <code class="language-plaintext highlighter-rouge">#</code>, so they’ve got to be comments, duh?). Well, not really: <code class="language-plaintext highlighter-rouge">#SBATCH</code> directives are parsed by SLURM before the script is run.</p> <h3 id="sbatch---ntasks1"><code class="language-plaintext highlighter-rouge">#SBATCH --ntasks=1</code></h3> <p>This argument specifies the number of tasks you want to run with the given script. In most cases, this will be 1, unless you want parallel execution and will utilize it explicitly in your script.</p> <p>For instance, you could specify <code class="language-plaintext highlighter-rouge">--ntasks=2</code> to run two tasks in parallel:</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="nb">echo</span> <span class="s2">"Hello there"</span> &amp;
 <span class="nb">echo</span> <span class="s2">"General Kenobi"</span> &amp;
 <span class="nb">wait</span>
</code></pre></div></div> <p>This can be particularly useful when you have some form of caching, utilize the same file across different scripts, or just like the idea of having all your experiments run on the same physical machine.</p> <h3 id="sbatch--a-uvasrg"><code class="language-plaintext highlighter-rouge">#SBATCH -A uvasrg</code></h3> <p>This option specifies which “allocation” you want to use. As a user at UVA (or on SLURM clusters in general), you may be part of one or more allocation groups, each with its own compute budget. This option ensures that your scripts run against the allocation group you specify. If you happen to be in <a href="https://uvasrg.github.io">UVASRG</a>, this is the option for you!</p> <p>If you’re curious about the compute budget used up/left in your allocation, you can run:</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>allocations
</code></pre></div></div> <p>Furthermore, if you’re curious about other members in the same allocation group, you can run:</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>allocations <span class="nt">-a</span> &lt;your_group_name&gt;
</code></pre></div></div> <h3 id="sbatch---mem32g"><code class="language-plaintext highlighter-rouge">#SBATCH --mem=32G</code></h3> <p>This is pretty straightforward and lets you specify the total memory (in GBs) you want to allocate to your job. In most cases, 32 or 64 GBs is enough.</p> <div class="alert alert-info" role="alert"> Rivanna has an upper limit of 32 GBs of memory per core, so you'll need to be careful with this. </div> <h3 id="sbatch--p-gpu"><code class="language-plaintext highlighter-rouge">#SBATCH -p gpu</code></h3> <p>This option here specifies which partition you want your scripts to run on. For most users (at least with machine learning), this will be ‘gpu’. You can have a look at all the available partitions <a href="https://www.rc.virginia.edu/userinfo/rivanna/queues/">here</a>.</p> <h3 id="sbatch---gresgpurtx20801"><code class="language-plaintext highlighter-rouge">#SBATCH --gres=gpu:rtx2080:1</code></h3> <p>This option here (the <code class="language-plaintext highlighter-rouge">--gres</code> in general) allows you to specify configurations of the nodes you want to run your scripts on. In this case, the <code class="language-plaintext highlighter-rouge">rtx2080</code> is the name of the GPU card, and <code class="language-plaintext highlighter-rouge">1</code> is the number of cards you want to use. In terms of speed, you might want to prefer v100 over 2080.</p> <p>One thing to note here: there may be times when machines with one type of GPU card are free while others are in use. It might be better to specify the other GPU in such cases instead of waiting for the busy GPU machines to clear up. But how exactly would you know which machines are free and which ones are not?</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sinfo <span class="nt">-o</span> <span class="s2">"%20N %10R %10e %25f %25G %t %C"</span> <span class="nt">-t</span> IDLE,MIX
</code></pre></div></div> <p>This provides information on the status of all machines, their GPU cards (and how many they have), free memory, and free/busy CPU cores. Pretty useful, huh!</p> <h3 id="sbatch---cpus-per-task10"><code class="language-plaintext highlighter-rouge">#SBATCH --cpus-per-task=10</code></h3> <p>This option lets you specify the number of CPUs you require per task in your script. From what I know, Rivanna has an upper limit of 10 per job for the GPU servers, but I’m not too sure about it.</p> <h3 id="sbatch---time3-000000"><code class="language-plaintext highlighter-rouge">#SBATCH --time=3-00:00:00</code></h3> <p>This option lets you specify an upper time limit for your job; SLURM will terminate the job once this limit is reached (the default is much lower than 3 days for Rivanna, from what I remember). We have an upper limit of 3 days, so this pretty much tells SLURM to run the job for as long as possible. If you want to run your job for longer, make sure you cache results so that a re-run can pick up from where the previous job left off.</p> <h3 id="sbatch---outputlogsx-jout"><code class="language-plaintext highlighter-rouge">#SBATCH --output=logs/%x-%j.out</code></h3> <p>These two options (this and <code class="language-plaintext highlighter-rouge">#SBATCH --error=logs/%x-%j.err</code>) specify the filenames for the output and error logs. Here, <code class="language-plaintext highlighter-rouge">%x</code> expands to the job name (set via <code class="language-plaintext highlighter-rouge">--job-name</code>), and <code class="language-plaintext highlighter-rouge">%j</code> to the job ID.</p> <div class="alert alert-secondary" role="alert"> P.S. You might wanna make sure the directory `logs` exists where you submit your job, or SLURM will have nowhere to write these files. </div> <h2 id="submitting-a-job">Submitting a job</h2> <p>There you go!
Now that you know what each flag here means, let’s get to submitting the job:</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code> sbatch <span class="nt">--export</span><span class="o">=</span>ALL,PARAM1<span class="o">=</span><span class="s1">'no'</span>,PARAM2<span class="o">=</span>42 <span class="nt">--job-name</span><span class="o">=</span>try test.sbatch
</code></pre></div></div> <p>Note how we have two additional parameters, <code class="language-plaintext highlighter-rouge">PARAM1</code> and <code class="language-plaintext highlighter-rouge">PARAM2</code>. This is a way for you to pass parameters to your job, which can then be used in your script. Make sure you leave the <code class="language-plaintext highlighter-rouge">ALL</code> in there, since this passes on your environment’s predefined variables to the job.</p> <div class="alert alert-danger" role="alert"> Another sidenote (that I found out the hard way): make sure all your flags and options are specified <b>before</b> the filename of your script. If they aren't, `sbatch` will simply ignore them. </div> <h1 id="job-arrays">Job Arrays</h1> <p>Sometimes you might want to run the same script with different parameters: maybe you want to run a grid search or generate results for multiple datasets. Instead of creating new sbatch scripts in this case, you can utilize job arrays: a way to run multiple jobs with the same configuration but different parameters.</p> <p>For instance, you could run a job array with the following command (note that <code class="language-plaintext highlighter-rouge">--array</code>, like all flags, goes before the script filename):</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sbatch <span class="nt">--array</span><span class="o">=</span>0-6 <span class="nt">--job-name</span><span class="o">=</span>try test.sbatch
</code></pre></div></div> <p>Note the additional <code class="language-plaintext highlighter-rouge">--array=0-6</code> option. This specifies that you want to run 7 jobs, starting from 0 until 6 (all-inclusive). You could then modify the sbatch script to use different parameters based on the job index:</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>...
<span class="nv">RATIOS</span><span class="o">=(</span>0.2 0.3 0.4 0.5 0.6 0.7 0.8<span class="o">)</span>
<span class="nb">echo</span> <span class="s2">"Running for ratio </span><span class="k">${</span><span class="nv">RATIOS</span><span class="p">[</span><span class="nv">$SLURM_ARRAY_TASK_ID</span><span class="p">]</span><span class="k">}</span><span class="s2">"</span>
...
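Extending this idea, a single job array can also sweep a two-dimensional grid by decoding the array index arithmetically. A minimal sketch (the parameter names and values are illustrative, not from the original post):

```shell
#!/bin/bash
# Decode a flat array index (e.g. from: sbatch --array=0-8 grid.sbatch)
# into a 2-D grid of hyperparameters.
LRS=(0.1 0.01 0.001)
RATIOS=(0.2 0.5 0.8)
TASK_ID=${SLURM_ARRAY_TASK_ID:-4}  # set by SLURM; default only for illustration
LR=${LRS[$((TASK_ID / ${#RATIOS[@]}))]}
RATIO=${RATIOS[$((TASK_ID % ${#RATIOS[@]}))]}
echo "lr=$LR ratio=$RATIO"
```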
</code></pre></div></div> <h1 id="tips-and-tricks">Tips and Tricks</h1> <h2 id="start-time-estimate">Start-Time Estimate</h2> <p>Submitted a job, but it’s stuck in <code class="language-plaintext highlighter-rouge">(Priority)</code> or <code class="language-plaintext highlighter-rouge">(Resources)</code>? In cases like these, getting a rough estimate of when your job will start can be helpful. If the wait looks too long, you can always fall back to some other servers (or change your job’s configuration so that it gets scheduled sooner).</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code> squeue <span class="nt">-u</span> &lt;your_username&gt; <span class="nt">--start</span>
</code></pre></div></div> <p>If you skip the <code class="language-plaintext highlighter-rouge">--start</code> above, you can view all your jobs and relevant information: how long they’ve been running for, their status, etc.</p> <h2 id="opening-interactive-sessions">Opening interactive sessions</h2> <p>Sometimes you may want to test your code out for smaller cases and/or debug, and going to and fro with log files may not be very convenient. You can open an interactive session on the cluster using the following command:</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code> ijob <span class="nt">-A</span> uvasrg <span class="nt">-p</span> gpu <span class="nt">--time</span><span class="o">=</span>0-00:30:00 <span class="nt">--gres</span><span class="o">=</span>gpu:rtx2080:1 <span class="nt">--mem</span><span class="o">=</span>8G 
</code></pre></div></div> <p>This command would open an interactive session (capped at 30 minutes, so that you don’t accidentally leave it running and get charged for it) on a machine with an RTX2080 GPU card and an 8GB memory allocation.</p> <h2 id="adjusting-resources-based-on-job-efficiency">Adjusting resources based on job efficiency</h2> <p>It may so happen that you over-estimate the resources needed for your job, which can lead to higher allocation consumption, as well as your scripts running at a later time (because of busy resources). A good starting point is to look at runs of your completed scripts:</p> <div class="language-console highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp">you@machine:~$</span><span class="w"> </span>sacct <span class="nt">--starttime</span><span class="o">=</span>2022-02-04 <span class="nt">--endtime</span><span class="o">=</span>2022-02-11 <span class="nt">--state</span> COMPLETED  <span class="nt">-u</span> &lt;your_username&gt;
<span class="go">         JobID    JobName  Partition    Account  AllocCPUS      State ExitCode 
------------ ---------- ---------- ---------- ---------- ---------- -------- 
</span><span class="gp">32259190       job1        gpu     &lt;acc_name&gt;</span><span class="w">         </span>16  COMPLETED      0:0 
<span class="gp">32259191       job2        gpu     &lt;acc_name&gt;</span><span class="w">         </span>16  COMPLETED      0:0 
</code></pre></div></div> <p>This command here, for instance, will give you a list of all completed jobs that started and finished between 4th and 11th February, 2022. You can then pick any jobID that you like and look at its CPU and memory usage efficiency:</p> <div class="language-console highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp">you@machine:~$</span><span class="w"> </span>seff 32259190
<span class="go">Job ID: 32259190
Cluster: shen
</span><span class="gp">User/Group: &lt;your_username&gt;</span>/users
<span class="go">State: COMPLETED (exit code 0)
Nodes: 1
Cores per node: 16
CPU Utilized: 1-20:54:32
CPU Efficiency: 21.72% of 8-14:47:28 core-walltime
Job Wall-clock time: 12:55:28
Memory Utilized: 5.72 GB
Memory Efficiency: 17.86% of 32.00 GB
</span></code></pre></div></div> <h2 id="running-gpu-based-scripts">Running GPU-based scripts</h2> <p>It’s good practice to specify the GPU card you want to use for your job. Most users might be familiar with <code class="language-plaintext highlighter-rouge">CUDA_VISIBLE_DEVICES</code> and setting this value before their experiments, or via code (in PyTorch, for instance). However, the machines that SLURM runs scripts on make all of their GPUs visible to every job, for some reason. As a result, even if you request one GPU and your job runs on a machine with 8 GPUs, <code class="language-plaintext highlighter-rouge">export CUDA_VISIBLE_DEVICES=0</code> will not be a no-op (as you might expect, since the visible compute environment should contain only one GPU); instead, the job will run on the first physical GPU, even if it’s not the one that’s free.</p> <p>I found this out the hard way (it is not mentioned in the website documentation, but the support staff were nice enough to tell me about it): I submitted 8-10 jobs and they were all really slow, since they all got assigned the same machine and the <code class="language-plaintext highlighter-rouge">CUDA_VISIBLE_DEVICES</code> made them run on the same GPU.</p> <div class="alert alert-danger" role="alert"> tl;dr Do not set CUDA_VISIBLE_DEVICES (via environment variable or within code) for your jobs; SLURM manages that for you.
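If you want to double-check what you were actually given, inspect (rather than set) the variable from inside the job; a quick sketch:

```shell
# Print the GPU(s) SLURM assigned to this job, without overriding anything.
GPUS="${CUDA_VISIBLE_DEVICES:-unset}"
echo "Assigned GPUs: $GPUS"
```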
</div>]]></content><author><name>Anshuman Suri</name></author><category term="guide"/><category term="slurm"/><category term="rivanna guide"/><summary type="html"><![CDATA[A tutorial on how to run scripts on Rivanna (SLURM in general) cluster at UVA, along with some tricks.]]></summary></entry><entry><title type="html">On the Risks of Distribution Inference</title><link href="https://anshumansuri.com/blog/2021/distr-inf/" rel="alternate" type="text/html" title="On the Risks of Distribution Inference"/><published>2021-06-27T00:00:00+00:00</published><updated>2021-06-27T00:00:00+00:00</updated><id>https://anshumansuri.com/blog/2021/distr-inf</id><content type="html" xml:base="https://anshumansuri.com/blog/2021/distr-inf/"><![CDATA[<p>Inference attacks seek to infer sensitive information about the training process of a revealed machine-learned model, most often about the training data.</p> <p>Standard inference attacks (which we call “dataset inference attacks”) aim to learn something about a particular record that may have been in that training data. For example, in a membership inference attack <d-cite key="shokri2017membership"> </d-cite>, the adversary aims to infer whether or not a particular record was included in the training data.</p> <p>Differential Privacy provides a theoretical notion of privacy that maps well to membership inference attacks. However, it provides privacy at the dataset level. Thus, it doesn’t capture attacks that violate privacy at the distribution level. This is where property inference comes in. Property inference, a different kind of inference risk, involves an adversary that aims to infer some statistical property of the training distribution.</p> <p>We illustrate the kind of risks introduced by property inference via a fictional example. Skynet, an (imaginary) organization that handles private data, releases a machine learning model \(M\) trained on their network flow graphs to predict faulty nodes in a network of servers. 
However, an adversary (\(\mathcal{A}\)) that wishes to launch a bot-net into this cluster of servers sees an opportunity in this model. They seek to infer whether the effective diameter (\(90^{th}\) percentile of all pair-wise shortest paths) of the network is below 6 (\(\mathcal{D}_0\)) or not (\(\mathcal{D}_1\)).</p> <p>We picked this property as an example based on useful properties cited in the traffic classification literature (e.g. <d-cite key="iliofotou2009graph"> </d-cite>). Learning this property might be useful for the adversary in crafting a bot-net that would not be detected by Skynet’s bot-detection software. The main point of the illustration is to convey that an adversary can infer properties of the underlying data distribution that a model producer would not expect and that might be valuable to the adversary.</p> <h2 id="formalizing-property-inference">Formalizing Property Inference</h2> <p>To formalize property inference attacks, we adapt the cryptographic game for membership inference proposed by Yeom et al. <d-cite key="yeom2018privacy"> </d-cite>:</p> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/dist_inf/yeom-480.webp 480w,/assets/img/dist_inf/yeom-800.webp 800w,/assets/img/dist_inf/yeom-1400.webp 1400w," type="image/webp" sizes="95vw"/> <img src="/assets/img/dist_inf/yeom.png" class="img-fluid rounded z-depth-1" width="100%" height="auto" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> <p>In this game, the victim samples a dataset S from the distribution \(\mathcal{D}\) and trains a model $M$ on it. It then samples some data-point $z$ from either $S$ or \(\mathcal{D}\), based on \(b \xleftarrow{R}\){0,1}. The adversary then tries to infer $b$ using algorithm $H$, given access to (\(z\), \(\mathcal{D}\), \(M\)). This cryptographic game captures the intuitive notion of membership inference. 
It focuses on a particular dataset and sample: inferring whether a given data point was part of training data.</p> <p>In contrast, property inference focuses on properties of the underlying distribution ($\mathcal{D}$), not the dataset ($S$) itself. To capture property inference, we propose a similar cryptographic game. Instead of differentiating between the sources of a specific data point (\(S\) or \(\mathcal{D}\)), we propose distinguishing between two distributions, \(\mathcal{D}_0\) and \(\mathcal{D}_1\).</p> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/dist_inf/distr-480.webp 480w,/assets/img/dist_inf/distr-800.webp 800w,/assets/img/dist_inf/distr-1400.webp 1400w," type="image/webp" sizes="95vw"/> <img src="/assets/img/dist_inf/distr.png" class="img-fluid rounded z-depth-1" width="100%" height="auto" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> <p>A model trainer \(\mathcal{B}\) samples a dataset $D$ from one of the distributions \(\mathcal{D}_0\), \(\mathcal{D}_1\). These distributions can be obtained from the publicly known distribution \(\mathcal{D}\) by applying functions \(\mathcal{G}_0\) and \(\mathcal{G}_1\), respectively, which transform distributions (and represent the “property” an adversary might care about).
So, we formalize distribution inference with this question:</p> <blockquote> <p>Given a model trained on this dataset $D$ drawn from either distribution \(\mathcal{D}_0\) or \(\mathcal{D}_1\), can an adversary infer from which of \(\mathcal{D}_0\), \(\mathcal{D}_1\) the dataset was sampled?</p> </blockquote> <p>Frameworks like Differential Privacy do not apply here: the adversary cares about statistical properties of the distribution the model was trained on, not details about a particular sampled dataset.</p> <h2 id="evaluating-risk-of-property-inference">Evaluating Risk of Property Inference</h2> <p>Most often in the literature, the adversary considers the ratio of members in a dataset satisfying a particular Boolean function \(f\) as the “property.” It then aims to distinguish between models trained on datasets with different proportions.</p> <p>However, these experiments often test with arbitrary ratios, making it hard to understand the relative risk of different properties. Some examples are Chase et al. <d-cite key="mahloujifar2022property"> </d-cite>, which considers 0.1 v/s 0.25, and Zhang et al. <d-cite key="zhang2021leakage"></d-cite>, which considers 0.33 v/s 0.67.</p> <p>To better understand how well an intuitive notion of divergence in properties aligns with observed inference risk, we execute property inference attacks with increasingly divergent properties. We fix one property (ratio=0.5) and vary the other ($\alpha$).
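In this ratio-based setting, the transformation functions from the game above can be sketched as follows (my notation, not verbatim from the paper):

```latex
% G_alpha yields a distribution in which a fraction alpha of records satisfies f:
\mathcal{G}_\alpha(\mathcal{D}) \quad \text{such that} \quad
  \Pr_{(x, y) \sim \mathcal{G}_\alpha(\mathcal{D})}\big[ f(x) = 1 \big] = \alpha
% The adversary then distinguishes models trained on G_{0.5}(D)
% from models trained on G_alpha(D), for varying alpha.
```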
We perform these experiments for three datasets: focusing on the ratio of females for the <a href="https://dl.acm.org/doi/pdf/10.1145/380995.381030">US Census</a> and <a href="https://pubs.rsna.org/doi/pdf/10.1148/radiol.2018180736">RSNA BoneAge</a> datasets, and the average node-degree for the <a href="https://direct.mit.edu/qss/article/1/1/396/15572/Microsoft-Academic-Graph-When-experts-are-not">OGBN arXiv</a> dataset.</p> <p>The state-of-the-art method for property inference attacks involves meta-classifiers, usually using Permutation Invariant Networks (Ganju et al., <d-cite key="ganju2018property"></d-cite>). After training hundreds or thousands of models locally, the adversary trains a meta-classifier on model parameters.</p> <figure> <video src="/assets/img/dist_inf/PIM-Animation.mp4" class="img-fluid rounded z-depth-1" width="auto" height="auto" autoplay="" controls=""/> </figure> <div class="caption"> Illustration of the functioning of a Permutation Invariant Network. The process of model-parameter extraction involves constructing permutation-invariant representations of neurons per layer $h_i$ using learnable parameters ($\phi_i$). These representations are then joined together for all layers with another learnable transform $\rho$, yielding the meta-classifier’s predictions. 
</div> <p>We use two simple attacks (using only model outputs) as baselines:</p> <ul> <li><strong>Loss Test</strong>: predict the property based on the model’s performance on data from the distribution it was trained on, compared to the other candidate distribution.</li> <li><strong>Threshold Test</strong>: extends the loss test by calibrating performance trends on a small set of models and arriving at a threshold based on model performance.</li> </ul> <h2 id="experimental-results">Experimental Results</h2> <p>Our results demonstrate how a meta-classifier can differentiate between models with ratios as similar as 0.5 and 0.6:</p> <div class="row mt-3"> <div class="col-sm mt-3 mt-md-0"> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/dist_inf/census_meta-480.webp 480w,/assets/img/dist_inf/census_meta-800.webp 800w,/assets/img/dist_inf/census_meta-1400.webp 1400w," type="image/webp" sizes="95vw"/> <img src="/assets/img/dist_inf/census_meta.png" class="img-fluid rounded z-depth-1" width="100%" height="auto" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> </div> <div class="col-sm mt-3 mt-md-0"> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/dist_inf/rsna_meta-480.webp 480w,/assets/img/dist_inf/rsna_meta-800.webp 800w,/assets/img/dist_inf/rsna_meta-1400.webp 1400w," type="image/webp" sizes="95vw"/> <img src="/assets/img/dist_inf/rsna_meta.png" class="img-fluid rounded z-depth-1" width="100%" height="auto" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> </div> </div> <div class="caption"> Figure 1: Differentiating between models trained on datasets with 50% females v/s varying proportions ($\alpha$) of females. Orange crosses are for the Loss Test; green with error bars are the Threshold Test; the blue box-plots are the meta-classifiers.
</div> <p>The meta-classifier attacks provide the best predictions, but the loss-test and threshold-test can serve as valuable baselines — even these simple attacks provide accuracies significantly better than random-guessing.</p> <h3 id="inferring-graph-properties">Inferring Graph Properties</h3> <p>Our proposed definitions allow the property to hold over the whole dataset, not just aggregate statistics like mean ratio. Thus, we focus on node-classification for a graph: differentiating between versions of the graph with varying mean node-degrees as the property. We fix one property (mean node-degree 13) and vary the other ($\alpha$). Inferring the mean node-degree is a novel property inference task, since the property here holds over the entirety of training data: no such property has been explored in the literature yet.</p> <div class="row mt-3"> <div class="col-sm mt-3 mt-md-0"> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/dist_inf/arxiv-480.webp 480w,/assets/img/dist_inf/arxiv-800.webp 800w,/assets/img/dist_inf/arxiv-1400.webp 1400w," type="image/webp" sizes="95vw"/> <img src="/assets/img/dist_inf/arxiv.png" class="img-fluid rounded z-depth-1" width="100%" height="auto" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> </div> <div class="col-sm mt-3 mt-md-0"> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/dist_inf/arxiv_degree-480.webp 480w,/assets/img/dist_inf/arxiv_degree-800.webp 800w,/assets/img/dist_inf/arxiv_degree-1400.webp 1400w," type="image/webp" sizes="95vw"/> <img src="/assets/img/dist_inf/arxiv_degree.png" class="img-fluid rounded z-depth-1" width="100%" height="auto" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> </div> </div> <div class="caption"> Figure 2 (Left): Differentiating between models trained on datasets with mean node-degree 13 v/s $\alpha$ on
the OGBN arXiv dataset. Figure 3 (Right): Predicting the mean node-degree of training data graphs directly with a meta-classifier on the OGBN arXiv dataset. </div> <p>Our results demonstrate how a meta-classifier can distinguish between models trained on graphs with different mean node-degrees (Figure 2). Encouraged by the success of meta-classifiers for this task, we also tried a meta-classifier variant that predicts the mean node-degree of training graphs directly (Figure 3). The resulting meta-classifier even generalizes well, accurately predicting mean node-degrees for distributions ($\alpha$={12.5, 13.5}) that it hasn’t seen.</p> <h2 id="summary">Summary</h2> <p>Our work on distribution inference formalizes and shows how property inference attacks can indeed infer distribution-level properties. Our ongoing work is focused on quantifying and studying this ‘privacy leakage’ of properties and its implications.</p> <p><a href="http://anshumansuri.com/">Anshuman Suri</a>, <a href="http://www.cs.virginia.edu/~evans/">David Evans</a>.
<a href="/publication/distribution-inference/"><em>Formalizing Distribution Inference Risks</em></a> - Workshop on Theory and Practice of Differential Privacy, ICML 2021.</p> <p>Code: <a href="https://github.com/iamgroot42/distribution_inference">https://github.com/iamgroot42/distribution_inference</a></p>]]></content><author><name>Anshuman Suri</name></author><category term="paper"/><category term="property inference"/><category term="privacy"/><category term="distribution inference"/><summary type="html"><![CDATA[A blog post describing our work on Property Inference attacks.]]></summary></entry><entry><title type="html">Reassessing adversarial training with fixed data augmentation</title><link href="https://anshumansuri.com/blog/2021/advrob-aug/" rel="alternate" type="text/html" title="Reassessing adversarial training with fixed data augmentation"/><published>2021-06-24T00:00:00+00:00</published><updated>2021-06-24T00:00:00+00:00</updated><id>https://anshumansuri.com/blog/2021/advrob-aug</id><content type="html" xml:base="https://anshumansuri.com/blog/2021/advrob-aug/"><![CDATA[<h2 id="overview">Overview</h2> <p>A couple months ago, a <a href="https://www.reddit.com/r/MachineLearning/comments/mocpgj/p_using_pytorch_numpy_a_bug_that_plagues/">post on Reddit</a> highlighted a bug in PyTorch + NumPy that affects how data augmentation works (see image above). Knowing nearly all of my projects use this combination, I read through the <a href="https://tanelp.github.io/posts/a-bug-that-plagues-thousands-of-open-source-ml-projects/">linked blog</a> by Tanel Pärnamaa to see what it was all about. I was a bit shocked that it took our community this long to notice a bug this severe! Nearly all data-loaders use more than one worker. 
Unfortunately, not many people (clearly, since it took us all so long to notice this bug) sit down to debug data augmentation at this level within their ML pipeline.</p> <p>Reading through this bug, I remembered how (proper) data augmentation had been proposed as a means to reduce robust overfitting by authors at DeepMind <d-cite key="rebuffi2021fixing"> </d-cite>. This paper got me thinking: “Could fixing this augmentation bug and rerunning adversarial training lead to gains in robustness?” Curious to see the impact of the fix, I decided to run some experiments of my own. You can head over to <a href="https://github.com/iamgroot42/aug_robust_blogpost">the repository</a> and run them yourself if you want.</p> <h2 id="experiments">Experiments</h2> <p>I chose the CIFAR-10 dataset: small enough to iterate quickly, yet challenging enough to surface meaningful performance differences.</p> <h3 id="standard-training">Standard Training</h3> <p>Interestingly, standard training with the fixed data-augmentation pipeline <strong>hurt</strong> performance slightly, compared to using the faulty augmentation:</p> <table> <thead> <tr> <th>Model</th> <th>Standard Accuracy (%)</th> <th>Robust Accuracy (ε = 8/255) (%)</th> </tr> </thead> <tbody> <tr> <td>Standard</td> <td>89.140</td> <td>0.000</td> </tr> <tr> <td>Standard (augmentation)</td> <td>94.720</td> <td>0.000</td> </tr> <tr> <td>Standard (fixed augmentation)</td> <td>94.620</td> <td>0.000</td> </tr> </tbody> </table> <h3 id="adversarial-training">Adversarial Training</h3> <p>Not thinking much about the 0.1% performance drop (probably statistical noise, right?), I ran adversarial training with \(L_\infty\) robustness (\(\epsilon=\frac{8}{255}\)):</p> <table> <thead> <tr> <th>Model</th> <th>Standard Accuracy (%)</th> <th>Robust Accuracy (ε = 8/255) (%)</th> <th>Robust Accuracy (ε = 16/255) (%)</th> </tr> </thead> <tbody> <tr> <td>Robust</td> <td>79.520</td> <td>44.370</td> <td>15.680</td> </tr> <tr> <td>Robust 
(augmentation)</td> <td>86.320</td> <td>51.400</td> <td>17.480</td> </tr> <tr> <td>Robust (fixed augmentation)</td> <td>86.730</td> <td>51.880</td> <td>17.570</td> </tr> </tbody> </table> <p>As visible here, the fixed augmentation pipeline yields an absolute 0.48% gain in robust accuracy at $\epsilon=\frac{8}{255}$, and a 0.09% gain at $\epsilon=\frac{16}{255}$. Although the 0.09% is not very significant, the 0.48% improvement seems non-trivial, especially relative to the small margins that separate methods on <a href="https://robustbench.github.io/#div_cifar10_Linf_heading">benchmarks</a> for this dataset. Accuracy on clean data improves as well, by an absolute 0.41%.</p> <p>Not wanting to make any claims based on experiments with just the \(L_\infty\) norm, I reran the same set of experiments for the \(L_2\) norm (\(\epsilon=1\)).</p> <table> <thead> <tr> <th>Model</th> <th>Standard Accuracy (%)</th> <th>Robust Accuracy (ε = 0.5) (%)</th> <th>Robust Accuracy (ε = 1) (%)</th> </tr> </thead> <tbody> <tr> <td>Robust</td> <td>78.190</td> <td>61.740</td> <td>42.830</td> </tr> <tr> <td>Robust (augmentation)</td> <td>80.560</td> <td>67.200</td> <td>51.140</td> </tr> <tr> <td>Robust (fixed augmentation)</td> <td>81.070</td> <td>67.620</td> <td>51.220</td> </tr> </tbody> </table> <p>Performance gains appear in this case as well. Accuracy on clean data goes up by 0.51%, while robustness at \(\epsilon=0.5\) and \(\epsilon=1.0\) improves by 0.42% and 0.08%, respectively. A consistent, albeit small, improvement in both clean and perturbed-data performance suggests that simply fixing this augmentation bug may provide a modest bump to existing training methods. It is entirely possible that these gains are just fluctuations from the randomness of model training. Regardless, fixing data-loaders is something that should be done anyway. 
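</p> <p>For completeness, the fix itself is tiny. In PyTorch it is a one-line <code>worker_init_fn</code>, e.g. <code>worker_init_fn=lambda _: np.random.seed(torch.initial_seed() % 2**32)</code>, which re-seeds NumPy with PyTorch’s per-worker (and per-epoch) seed. The sketch below shows the same idea with plain <code>multiprocessing</code> on Linux, with an arbitrary placeholder base seed and worker count: re-seeding inside each forked worker makes their draws independent again.</p>

```python
import multiprocessing as mp

import numpy as np


def draw_params(worker_id, seed, queue):
    # Re-seed NumPy *inside* the worker -- this is what a PyTorch
    # worker_init_fn is for. Each worker now has its own RNG stream.
    np.random.seed(seed)
    queue.put((worker_id, np.random.randint(0, 1_000_000, size=3).tolist()))


base_seed = 1234              # in PyTorch, derived from torch.initial_seed()
ctx = mp.get_context("fork")
queue = ctx.Queue()
workers = [ctx.Process(target=draw_params, args=(i, base_seed + i, queue))
           for i in range(2)]
for w in workers:
    w.start()
results = dict(queue.get() for _ in range(2))
for w in workers:
    w.join()

print(results[0] != results[1])  # -> True: workers now draw different values
```

<p>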
The goal of these experiments was to try and quantify the impact of improper augmentation. It would be great if someone with sufficient resources could run these experiments on a larger scale to rule out statistical noise.</p> <h2 id="takeaway">Takeaway</h2> <p>Fixing data augmentation can have a non-trivial (and positive) impact when training for robustness. Anyone training robust models (especially with adversarial training, since that is what I tested on) should fix their data-loaders.</p>]]></content><author><name>Anshuman Suri</name></author><category term="exploration"/><category term="robustness"/><category term="pytorch bug"/><category term="adversarial training"/><summary type="html"><![CDATA[A recent bug discovery on Pytorch+Numpy got me thinking- how much does this bug impact adversarial robustness?]]></summary></entry></feed>