<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Test on 🌸沐雪-medicago🎀</title>
    <link>https://3bfd66bd.hugo-blog-asi.pages.dev/tags/test/</link>
    <description>Recent content from 🌸沐雪-medicago🎀</description>
    <generator>Hugo</generator>
    <language>en-us</language>
    
    <managingEditor>medicago.19@outlook.com (medicago)</managingEditor>
    <webMaster>medicago.19@outlook.com (medicago)</webMaster>
    
    <copyright>All articles on this blog are licensed under the BY-NC-SA license agreement unless otherwise stated. Please indicate the source when reprinting!</copyright>
    
    <lastBuildDate>Thu, 09 Jun 2022 20:12:52 +0800</lastBuildDate>
    
    
    <atom:link href="https://3bfd66bd.hugo-blog-asi.pages.dev/tags/test/index.xml" rel="self" type="application/rss&#43;xml" />
    

    
    

    <item>
      <title>Markdown Basic Elements</title>
      <link>https://3bfd66bd.hugo-blog-asi.pages.dev/post/main/</link>
      <pubDate>Thu, 09 Jun 2022 20:12:52 +0800</pubDate>
      <author>medicago.19@outlook.com (medicago)</author>
      <guid>https://3bfd66bd.hugo-blog-asi.pages.dev/post/main/</guid>
      <description>
        <![CDATA[<h1>Markdown Basic Elements</h1><p>Author: medicago(medicago.19@outlook.com)</p>
        
          <h2 id="markdown-基本元素">
<a class="header-anchor" href="#markdown-%e5%9f%ba%e6%9c%ac%e5%85%83%e7%b4%a0"></a>
Markdown Basic Elements
</h2><h3 id="标题">
<a class="header-anchor" href="#%e6%a0%87%e9%a2%98"></a>
Headings
</h3><h1 id="h1">
<a class="header-anchor" href="#h1"></a>
H1
</h1><h2 id="h2">
<a class="header-anchor" href="#h2"></a>
H2
</h2><h3 id="h3">
<a class="header-anchor" href="#h3"></a>
H3
</h3><h4 id="h4">
<a class="header-anchor" href="#h4"></a>
H4
</h4><h5 id="h5">
<a class="header-anchor" href="#h5"></a>
H5
</h5><h6 id="h6">
<a class="header-anchor" href="#h6"></a>
H6
</h6><h3 id="强调">
<a class="header-anchor" href="#%e5%bc%ba%e8%b0%83"></a>
Emphasis
</h3><p>Emphasis, aka italics, with <em>asterisks</em> or <em>underscores</em>.</p>
<p>Strong emphasis, aka bold, with <strong>asterisks</strong> or <strong>underscores</strong>.</p>
<p>Combined emphasis with <strong>asterisks and <em>underscores</em></strong>.</p>
<p>Strikethrough uses two tildes. <del>Scratch this.</del></p>
<h3 id="列表">
<a class="header-anchor" href="#%e5%88%97%e8%a1%a8"></a>
Lists
</h3><h4 id="definition-list-dl">
<a class="header-anchor" href="#definition-list-dl"></a>
Definition List (dl)
</h4><!-- raw HTML omitted -->
<h4 id="ordered-list-ol">
<a class="header-anchor" href="#ordered-list-ol"></a>
Ordered List (ol)
</h4><ol>
<li>List Item 1</li>
<li>List Item 2</li>
<li>List Item 3</li>
</ol>
<h4 id="unordered-list-ul">
<a class="header-anchor" href="#unordered-list-ul"></a>
Unordered List (ul)
</h4><ul>
<li>List Item 1</li>
<li>List Item 2</li>
<li>List Item 3</li>
</ul>
<h3 id="段落">
<a class="header-anchor" href="#%e6%ae%b5%e8%90%bd"></a>
Paragraphs
</h3><p>Lorem ipsum dolor sit amet, <a href="">test link</a> consectetur adipiscing elit. <strong>Strong text</strong> pellentesque ligula commodo viverra vehicula. <em>Italic text</em> at ullamcorper enim. Morbi a euismod nibh. <u>Underline text</u> non elit nisl. <del>Deleted text</del> tristique, sem id condimentum tempus, metus lectus venenatis mauris, sit amet semper lorem felis a eros. Fusce egestas nibh at sagittis auctor. Sed ultricies ac arcu quis molestie. Donec dapibus nunc in nibh egestas, vitae volutpat sem iaculis. Curabitur sem tellus, elementum nec quam id, fermentum laoreet mi. Ut mollis ullamcorper turpis, vitae facilisis velit ultricies sit amet. Etiam laoreet dui odio, id tempus justo tincidunt id. Phasellus scelerisque nunc sed nunc ultricies accumsan.</p>
        
        <hr><p>Published on 2022-06-09 at <a href='https://3bfd66bd.hugo-blog-asi.pages.dev/'>🌸沐雪-medicago🎀</a>, last modified on 2022-06-09</p>]]>
      </description>
      
        <category>test</category>
      
    </item>
    
    

    <item>
      <title>The BitNet (1.58-bit) System</title>
      <link>https://3bfd66bd.hugo-blog-asi.pages.dev/post/llm_bitnet/</link>
      <pubDate>Thu, 09 Jun 2022 20:12:52 +0800</pubDate>
      <author>medicago.19@outlook.com (medicago)</author>
      <guid>https://3bfd66bd.hugo-blog-asi.pages.dev/post/llm_bitnet/</guid>
      <description>
        <![CDATA[<h1>The BitNet (1.58-bit) System</h1><p>Author: medicago(medicago.19@outlook.com)</p>
        
          <h1 id="-1-bitnet158bit体系">
<a class="header-anchor" href="#-1-bitnet158bit%e4%bd%93%e7%b3%bb"></a>
1. The BitNet (1.58-bit) System
</h1><h2 id="q1bitnet-是什么为什么叫-158bit">
<a class="header-anchor" href="#q1bitnet-%e6%98%af%e4%bb%80%e4%b9%88%e4%b8%ba%e4%bb%80%e4%b9%88%e5%8f%ab-158bit"></a>
<strong>Q1: What is BitNet, and why is it called 1.58-bit?</strong>
</h2><h3 id="-bitnet-权重只有-3-个值">
<a class="header-anchor" href="#-bitnet-%e6%9d%83%e9%87%8d%e5%8f%aa%e6%9c%89-3-%e4%b8%aa%e5%80%bc"></a>
✔ BitNet weights take only 3 values:
</h3>$$
 \{-1,\ 0,\ +1\} 
$$<h3 id="-表示-3-个状态所需的最小信息量">
<a class="header-anchor" href="#-%e8%a1%a8%e7%a4%ba-3-%e4%b8%aa%e7%8a%b6%e6%80%81%e6%89%80%e9%9c%80%e7%9a%84%e6%9c%80%e5%b0%8f%e4%bf%a1%e6%81%af%e9%87%8f"></a>
✔ The minimum information needed to represent 3 states:
</h3>$$
 \log_2 3 \approx 1.58496 
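$$<p>Hence the name <strong>1.58-bit</strong>, which is the theoretical value.<br>
In practice, weights are stored as <strong>2-bit values plus compression</strong>, averaging close to 1.58 bits per weight.</p>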
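<p>As a worked illustration of how a packed format can get close to this bound (an assumed packing scheme chosen for the arithmetic, not necessarily what any particular implementation uses): five ternary weights have 3<sup>5</sup> = 243 possible combinations, which fit in a single 8-bit byte.</p>
$$
 3^5 = 243 \le 256 = 2^8 \quad\Longrightarrow\quad \frac{8\ \text{bits}}{5\ \text{weights}} = 1.6\ \text{bits per weight}
$$
<p>That is slightly above the theoretical 1.585 bits and below the 2 bits of a naive per-weight encoding.</p>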
<hr>
<h2 id="q2bitnet-的激活是什么格式">
<a class="header-anchor" href="#q2bitnet-%e7%9a%84%e6%bf%80%e6%b4%bb%e6%98%af%e4%bb%80%e4%b9%88%e6%a0%bc%e5%bc%8f"></a>
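<strong>Q2: What format do BitNet activations use?</strong>
</h2><p>Unlike the weights, the activations are not ternary: BitNet b1.58 quantizes them to <strong>8-bit integers (INT8)</strong>.</p>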
<hr>
<h2 id="q3bitnet-的训练方式是什么">
<a class="header-anchor" href="#q3bitnet-%e7%9a%84%e8%ae%ad%e7%bb%83%e6%96%b9%e5%bc%8f%e6%98%af%e4%bb%80%e4%b9%88"></a>
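<strong>Q3: How is BitNet trained?</strong>
</h2><p>Training relies on the following:</p>
<ul>
<li>FP32 shadow weights</li>
<li>Ternary quantization</li>
<li>STE (Straight-Through Estimator)</li>
<li>Special initialization</li>
<li>Special regularization (encouraging weights toward −1/0/+1)</li>
<li>A special optimizer</li>
<li>A special activation function</li>
</ul>
<p><strong>You cannot obtain a BitNet model by directly quantizing an FP16 model.</strong></p>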
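<p>A minimal sketch of how the shadow weights and the ternary quantization step fit together, assuming the absmean scheme described for BitNet b1.58. The symbols here are introduced for illustration only: W is the full-precision shadow weight matrix of shape n × m, γ is its mean absolute value, and ε is a small constant for numerical stability.</p>
$$
 \gamma = \frac{1}{nm}\sum_{i,j} |W_{ij}|, \qquad
 \widetilde{W} = \operatorname{RoundClip}\!\left(\frac{W}{\gamma + \epsilon},\ -1,\ 1\right), \qquad
 \operatorname{RoundClip}(x, a, b) = \max\bigl(a,\ \min(b,\ \operatorname{round}(x))\bigr)
$$
<p>Under this sketch the forward pass uses the ternary matrix, while the optimizer keeps updating the full-precision W.</p>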
<hr>
<h2 id="q4bitnet-的推理框架bitnetcpp能训练吗">
<a class="header-anchor" href="#q4bitnet-%e7%9a%84%e6%8e%a8%e7%90%86%e6%a1%86%e6%9e%b6bitnetcpp%e8%83%bd%e8%ae%ad%e7%bb%83%e5%90%97"></a>
<strong>Q4: Can BitNet's inference framework (bitnet.cpp) be used for training?</strong>
</h2><p>No.<br>
bitnet.cpp <strong>handles inference only</strong> and contains no training code.</p>
<hr>
<h2 id="q5bitnet-的优点是什么">
<a class="header-anchor" href="#q5bitnet-%e7%9a%84%e4%bc%98%e7%82%b9%e6%98%af%e4%bb%80%e4%b9%88"></a>
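<strong>Q5: What are BitNet's advantages?</strong>
</h2><ul>
<li>Very fast inference (roughly 2–6× speedup on CPU)</li>
<li>Energy consumption reduced by 55–82%</li>
<li>Multiplying by ternary weights can be implemented with table lookups or bitwise operations</li>
<li>In principle, very large models (100B) can run on a CPU</li>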
</ul>
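<p>The lookup/bitwise point can be seen from a single dot product. As a sketch, with x<sub>j</sub> denoting the activations and w<sub>j</sub> the ternary weights (symbols assumed here for illustration), no multiplications are needed, only additions and subtractions:</p>
$$
 y = \sum_j w_j x_j = \sum_{j:\, w_j = +1} x_j \;-\; \sum_{j:\, w_j = -1} x_j, \qquad w_j \in \{-1,\ 0,\ +1\}
$$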
<hr>
<h2 id="q6bitnet-的缺点是什么">
<a class="header-anchor" href="#q6bitnet-%e7%9a%84%e7%bc%ba%e7%82%b9%e6%98%af%e4%bb%80%e4%b9%88"></a>
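<strong>Q6: What are BitNet's drawbacks?</strong>
</h2><ul>
<li>Extremely hard to train</li>
<li>Limited expressive power</li>
<li>Requires a great deal of engineering tricks</li>
<li>So far, no high-quality large model (7B/13B/70B) has been trained successfully</li>
<li>Third-party models are noticeably lower in quality than FP16 models</li>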
</ul>
<hr>
<h1 id="-2-stestraightthrough-estimator">
<a class="header-anchor" href="#-2-stestraightthrough-estimator"></a>
2. STE (Straight-Through Estimator)
</h1><h2 id="q1为什么量化需要-ste">
<a class="header-anchor" href="#q1%e4%b8%ba%e4%bb%80%e4%b9%88%e9%87%8f%e5%8c%96%e9%9c%80%e8%a6%81-ste"></a>
<strong>Q1: Why does quantization need STE?</strong>
</h2><p>The quantization function is piecewise constant:</p>
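<p>A piecewise-constant function has zero gradient almost everywhere, so backpropagation through it carries no signal. As a sketch of the usual workaround, writing Q for the quantizer, w for a shadow weight, and L for the loss (all symbols assumed here for illustration), STE treats the quantizer as the identity in the backward pass:</p>
$$
 \frac{\partial Q(w)}{\partial w} = 0 \ \ \text{almost everywhere}, \qquad
 \text{STE:}\quad \frac{\partial \mathcal{L}}{\partial w} \approx \frac{\partial \mathcal{L}}{\partial Q(w)}
$$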
        
        <hr><p>Published on 2022-06-09 at <a href='https://3bfd66bd.hugo-blog-asi.pages.dev/'>🌸沐雪-medicago🎀</a>, last modified on 2022-06-09</p>]]>
      </description>
      
        <category>test</category>
      
    </item>
    
  </channel>
</rss>
