API Reference
Solvers
sawmil.quadprog
quadprog
quadprog(
H: NDArray[float64],
f: NDArray[float64],
Aeq: Optional[NDArray[float64]],
beq: Optional[NDArray[float64]],
lb: NDArray[float64],
ub: NDArray[float64],
solver: str = "gurobi",
verbose: bool = False,
solver_params: Optional[Mapping[str, Any]] = None,
) -> Union[
Tuple[npt.NDArray[np.float64], "Objective"], None
]
Solve the quadratic program:
minimize 0.5 * αᵀ H α + fᵀ α
subject to Aeq α = beq
lb ≤ α ≤ ub
Parameters:
- H (NDArray[float64]) – (n, n) quadratic term matrix in 0.5 * αᵀ H α
- f (NDArray[float64]) – (n,) linear term vector in fᵀ α, usually f = -1
- Aeq (Optional[NDArray[float64]]) – (m, n) equality constraint matrix, usually yᵀ
- beq (Optional[NDArray[float64]]) – (m,) equality constraint right-hand side, usually 0
- lb (NDArray[float64]) – (n,) lower bound vector, usually 0
- ub (NDArray[float64]) – (n,) upper bound vector, usually C
- solver (str, default: "gurobi") – name of the backend to use; the solver_params examples below name 'gurobi', 'osqp', and 'daqp'
- verbose (bool, default: False) – If True, print solver logs
- solver_params (Optional[Mapping[str, Any]], default: None) – dict of backend-specific options. Examples:
  - solver='gurobi': {'env': ..., 'params': {'Method': 2, 'Threads': 1}}
  - solver='osqp': {'setup': {...}, 'solve': {...}}, or flat keys for setup
  - solver='daqp': {'eps_abs': 1e-8, 'eps_rel': 1e-8, ...}
Returns:
- Union[Tuple[NDArray[float64], 'Objective'], None] – when a tuple is returned, it contains:
  - α*: optimal solution vector
  - Objective: quadratic and linear parts of the optimum
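To make the problem statement concrete, here is a minimal sketch in plain Python (no sawmil import) that evaluates the two parts of the objective and checks the constraints for a candidate α. The helper names `qp_objective` and `is_feasible` are hypothetical and are not part of sawmil; the toy data is the SVM dual for two one-dimensional points.

```python
from typing import Optional, Sequence, Tuple

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def qp_objective(H: Matrix, f: Vector, alpha: Vector) -> Tuple[float, float]:
    """Return the (quadratic, linear) parts of 0.5 * a^T H a + f^T a."""
    n = len(alpha)
    quadratic = 0.5 * sum(
        alpha[i] * H[i][j] * alpha[j] for i in range(n) for j in range(n)
    )
    linear = sum(fi * ai for fi, ai in zip(f, alpha))
    return quadratic, linear


def is_feasible(
    alpha: Vector,
    Aeq: Optional[Matrix],
    beq: Optional[Vector],
    lb: Vector,
    ub: Vector,
    tol: float = 1e-9,
) -> bool:
    """Check lb <= alpha <= ub and, if given, Aeq alpha = beq (within tol)."""
    in_box = all(l - tol <= a <= u + tol for a, l, u in zip(alpha, lb, ub))
    if Aeq is None or beq is None:
        return in_box
    rows_ok = all(
        abs(sum(r * a for r, a in zip(row, alpha)) - b) <= tol
        for row, b in zip(Aeq, beq)
    )
    return in_box and rows_ok


# Toy SVM dual: x = (+1, -1), y = (+1, -1), linear kernel, C = 10.
H = [[1.0, 1.0], [1.0, 1.0]]   # (y y^T) elementwise-times K
f = [-1.0, -1.0]               # usually -1 per component
Aeq = [[1.0, -1.0]]            # y^T
beq = [0.0]
lb = [0.0, 0.0]
ub = [10.0, 10.0]              # C

# On the line alpha_1 = alpha_2 = a the objective is 2a^2 - 2a, minimal at a = 0.5.
alpha = [0.5, 0.5]
quadratic, linear = qp_objective(H, f, alpha)
```

Here the quadratic part is 0.5 and the linear part is -1.0, so the objective value at this α is -0.5.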
Source code in src/sawmil/quadprog.py
sawmil.solvers._gurobi
quadprog_gurobi
quadprog_gurobi(
H: NDArray[float64],
f: NDArray[float64],
Aeq: Optional[NDArray[float64]],
beq: Optional[NDArray[float64]],
lb: NDArray[float64],
ub: NDArray[float64],
verbose: bool = False,
**params: Any,
) -> Tuple[npt.NDArray[np.float64], "Objective"]
Solve the quadratic program using Gurobi:
minimize 0.5 * αᵀ H α + fᵀ α
subject to Aeq α = beq
lb ≤ α ≤ ub
Parameters:
- H (NDArray[float64]) – (n, n) Hessian matrix for the quadratic term in 0.5 * αᵀ H α.
- f (NDArray[float64]) – (n,) linear term vector in fᵀ α. For SVMs, usually -1 for each component.
- Aeq (Optional[NDArray[float64]]) – (m, n) equality constraint matrix, usually yᵀ
- beq (Optional[NDArray[float64]]) – (m,) equality constraint right-hand side, usually 0
- lb (NDArray[float64]) – (n,) lower bound vector, usually 0
- ub (NDArray[float64]) – (n,) upper bound vector, usually C
- verbose (bool, default: False) – If True, print solver logs
Returns:
- NDArray[float64] – α*: optimal solution vector
- Objective ('Objective') – quadratic and linear parts of the optimum
Source code in src/sawmil/solvers/_gurobi.py
sawmil.solvers._osqp
quadprog_osqp
quadprog_osqp(
H: NDArray[float64],
f: NDArray[float64],
Aeq: Optional[NDArray[float64]],
beq: Optional[NDArray[float64]],
lb: NDArray[float64],
ub: NDArray[float64],
verbose: bool = False,
**params: Any,
) -> Tuple[npt.NDArray[np.float64], "Objective"]
Solve the quadratic program using OSQP:
minimize 0.5 * αᵀ H α + fᵀ α
subject to Aeq α = beq
lb ≤ α ≤ ub
Parameters:
- H (NDArray[float64]) – (n, n) Hessian matrix for the quadratic term in 0.5 * αᵀ H α. For SVM duals, this is typically (y yᵀ) ⊙ K where K is the kernel matrix.
- f (NDArray[float64]) – (n,) linear term vector in fᵀ α. For SVMs, usually -1 for each component.
- Aeq (Optional[NDArray[float64]]) – (m, n) optional equality constraint matrix, e.g. yᵀ for the SVM bias constraint.
- beq (Optional[NDArray[float64]]) – (m,) optional right-hand side of the equality constraint, usually 0.
- lb (NDArray[float64]) – (n,) lower bound vector, e.g. all zeros in the standard SVM dual.
- ub (NDArray[float64]) – (n,) upper bound vector, e.g. all entries equal to C in a soft-margin SVM.
- verbose (bool, default: False) – If True, print solver logs
Returns:
- NDArray[float64] – α*: optimal solution vector
- Objective ('Objective') – quadratic and linear parts of the optimum
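The Hessian described for H, (y yᵀ) ⊙ K, can be assembled as in the following sketch. It uses plain Python lists and a linear kernel purely for illustration; `linear_gram` and `svm_dual_hessian` are hypothetical helper names, not sawmil functions.

```python
from typing import List, Sequence


def linear_gram(X: Sequence[Sequence[float]]) -> List[List[float]]:
    """Gram matrix K[i][j] = <x_i, x_j> for a linear kernel."""
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in X] for xi in X]


def svm_dual_hessian(
    K: Sequence[Sequence[float]], y: Sequence[float]
) -> List[List[float]]:
    """Elementwise product of the label outer product y y^T with K."""
    n = len(y)
    return [[y[i] * y[j] * K[i][j] for j in range(n)] for i in range(n)]


X = [[1.0, 0.0], [0.0, 2.0], [-1.0, -1.0]]
y = [1.0, 1.0, -1.0]

K = linear_gram(X)
H = svm_dual_hessian(K, y)
```

Multiplying by y_i y_j flips the sign of every entry that pairs instances with opposite labels, and H stays symmetric whenever K is.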
Source code in src/sawmil/solvers/_osqp.py
Kernels
sawmil.kernels
BaseKernel
Bases: ABC
Minimal kernel interface for the single-instance kernels: fit (optional) + call.
Normalize
dataclass
Normalize(k: BaseKernel, eps: float = 1e-12)
Polynomial
dataclass
Polynomial(
degree: int = 3,
gamma: Optional[float] = None,
coef0: float = 0.0,
)
Precomputed
dataclass
Precomputed(K: ndarray)
Bases: BaseKernel
Use when a Gram matrix is already built; ignores X and Y and returns K (the caller checks the shape).
Product
dataclass
Product(k1: BaseKernel, k2: BaseKernel)
RBF
dataclass
RBF(gamma: Optional[float] = None)
Scale
dataclass
Scale(a: float, k: BaseKernel)
Sigmoid
dataclass
Sigmoid(gamma: Optional[float] = None, coef0: float = 0.0)
Sum
dataclass
Sum(k1: BaseKernel, k2: BaseKernel)
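The classes above are listed without formulas, so the following sketch shows the textbook definitions these names usually refer to, written as plain functions rather than sawmil dataclasses. Everything here is an assumption for illustration: the function names are hypothetical, gamma must be passed explicitly (how sawmil resolves gamma=None is not stated here), and the placement of eps inside the normalization is a guess.

```python
import math
from typing import Callable, Sequence

Vec = Sequence[float]
Kernel = Callable[[Vec, Vec], float]


def dot(x: Vec, y: Vec) -> float:
    return sum(a * b for a, b in zip(x, y))


def rbf(gamma: float) -> Kernel:
    """exp(-gamma * ||x - y||^2)."""
    return lambda x, y: math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def polynomial(degree: int = 3, gamma: float = 1.0, coef0: float = 0.0) -> Kernel:
    """(gamma * <x, y> + coef0) ** degree."""
    return lambda x, y: (gamma * dot(x, y) + coef0) ** degree


def sigmoid(gamma: float = 1.0, coef0: float = 0.0) -> Kernel:
    """tanh(gamma * <x, y> + coef0)."""
    return lambda x, y: math.tanh(gamma * dot(x, y) + coef0)


def k_sum(k1: Kernel, k2: Kernel) -> Kernel:
    return lambda x, y: k1(x, y) + k2(x, y)


def k_product(k1: Kernel, k2: Kernel) -> Kernel:
    return lambda x, y: k1(x, y) * k2(x, y)


def k_scale(a: float, k: Kernel) -> Kernel:
    return lambda x, y: a * k(x, y)


def k_normalize(k: Kernel, eps: float = 1e-12) -> Kernel:
    """Cosine-style normalization; eps guards against division by zero."""
    return lambda x, y: k(x, y) / math.sqrt(k(x, x) * k(y, y) + eps)


x, z = [1.0, 2.0], [3.0, 4.0]
poly = polynomial(degree=2, gamma=1.0, coef0=1.0)
combined = k_sum(k_scale(0.5, poly), rbf(0.5))
```

Composing kernels this way mirrors the Sum, Product, Scale, and Normalize signatures: each wrapper takes one or two kernels and returns another kernel with the same call shape.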
get_kernel
get_kernel(spec: KernelType, **kwargs) -> BaseKernel
Normalize various 'kernel=' inputs into a BaseKernel object. kwargs are used only when spec is a string (e.g., gamma, degree, coef0, K).
Source code in src/sawmil/kernels.py
sawmil.bag_kernels
BaseBagKernel
Bases: ABC
Base class for multiple-instance (bag) kernels.
WeightedMeanBagKernel
dataclass
WeightedMeanBagKernel(
inst_kernel: BaseKernel,
normalizer: _NormalizerName = "average",
p: float = 1.0,
use_intra_labels: bool = False,
fast_linear: bool = True,
)
Bases: BaseBagKernel
K(Bi, Bj) = (w_iᵀ k(Bi, Bj) w_j) ** p / (norm(Bi) * norm(Bj)), where w_i, w_j are instance weights normalized to sum to 1 (falling back to uniform weights if a bag's mask sums to 0).
Normalizer options (default is "average"):
- "none" -> norm(B) = 1
- "average" -> norm(B) = sum(mask) (fallback to bag size)
- "featurespace" -> norm(B) = sqrt(wᵀ k(X, X) w) (fast for Linear via ||weighted_mean||)
By default (use_intra_labels=False), the w_i are uniform weights (1/n_i), i.e. intra-bag labels are ignored. The signature above sets normalizer="average"; pass normalizer="none" for no extra scaling.
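The bag kernel formula can be traced by hand with a small sketch. This is a plain-Python approximation under stated assumptions, not the sawmil implementation: the function names are hypothetical, a mask of None stands in for "no intra-bag labels" (uniform weights, bag size as the "average" norm), and the "featurespace" normalizer is left out.

```python
from typing import Callable, List, Optional, Sequence

Vec = Sequence[float]


def linear(x: Vec, y: Vec) -> float:
    return sum(a * b for a, b in zip(x, y))


def bag_weights(n: int, mask: Optional[Sequence[float]]) -> List[float]:
    """Weights summing to 1; uniform if there is no mask or it sums to 0."""
    total = sum(mask) if mask is not None else 0.0
    if total == 0:
        return [1.0 / n] * n
    return [m / total for m in mask]


def bag_norm(n: int, mask: Optional[Sequence[float]], normalizer: str) -> float:
    if normalizer == "none":
        return 1.0
    if normalizer == "average":
        total = sum(mask) if mask is not None else 0.0
        return float(total) if total > 0 else float(n)  # fallback to bag size
    raise ValueError(f"unsupported normalizer in this sketch: {normalizer!r}")


def weighted_mean_bag_kernel(
    Bi: Sequence[Vec],
    Bj: Sequence[Vec],
    inst_kernel: Callable[[Vec, Vec], float] = linear,
    mask_i: Optional[Sequence[float]] = None,
    mask_j: Optional[Sequence[float]] = None,
    normalizer: str = "average",
    p: float = 1.0,
) -> float:
    wi = bag_weights(len(Bi), mask_i)
    wj = bag_weights(len(Bj), mask_j)
    # w_i^T k(Bi, Bj) w_j, written out as a double sum over instance pairs.
    s = sum(
        wi[a] * inst_kernel(Bi[a], Bj[b]) * wj[b]
        for a in range(len(Bi))
        for b in range(len(Bj))
    )
    norms = bag_norm(len(Bi), mask_i, normalizer) * bag_norm(len(Bj), mask_j, normalizer)
    return s ** p / norms


Bi = [[1.0, 0.0], [0.0, 1.0]]
Bj = [[1.0, 1.0]]
k_none = weighted_mean_bag_kernel(Bi, Bj, normalizer="none")
k_avg = weighted_mean_bag_kernel(Bi, Bj, normalizer="average")
```

With these two bags the weighted sum is 1.0, so "none" leaves it at 1.0 while "average" divides by the bag sizes 2 and 1 and gives 0.5. Note that a non-integer p only makes sense when the weighted sum is non-negative.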
Bags
sawmil.bag
Bag
dataclass
Bag(
X: NDArray[float64],
y: Label,
intra_bag_label: Optional[NDArray[float64]] = None,
)
A bag of instances with a bag-level label and per-instance (0/1) flags.
d
property
d: int
Number of features.
mask
property
mask: NDArray[float64]
Intra-bag label mask.
n
property
n: int
Number of instances in the bag.
negatives
negatives() -> npt.NDArray[np.int64]
Indices of instances with intra_bag_label == 0.
Source code in src/sawmil/bag.py
positives
positives() -> npt.NDArray[np.int64]
Indices of instances with intra_bag_label == 1.
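As a rough picture of how these properties and methods relate, here is a stand-in class in plain Python. `MiniBag` is hypothetical and only mirrors the documented names; it uses lists instead of NumPy arrays, and the all-ones default mask follows the statement elsewhere on this page that omitted intra-bag labels mean every instance is treated as positive.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class MiniBag:
    """Illustrative stand-in for a bag; not the sawmil Bag class."""

    X: Sequence[Sequence[float]]
    y: float
    intra_bag_label: Optional[Sequence[float]] = None

    @property
    def n(self) -> int:
        """Number of instances in the bag."""
        return len(self.X)

    @property
    def d(self) -> int:
        """Number of features."""
        return len(self.X[0]) if self.X else 0

    @property
    def mask(self) -> List[float]:
        """Intra-bag label mask; all ones when no labels were given."""
        if self.intra_bag_label is None:
            return [1.0] * self.n
        return [float(v) for v in self.intra_bag_label]

    def positives(self) -> List[int]:
        return [i for i, m in enumerate(self.mask) if m == 1.0]

    def negatives(self) -> List[int]:
        return [i for i, m in enumerate(self.mask) if m == 0.0]


bag = MiniBag(X=[[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]], y=1.0, intra_bag_label=[1, 0, 1])
unlabeled = MiniBag(X=[[1.0]], y=0.0)
```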
Source code in src/sawmil/bag.py
BagDataset
dataclass
BagDataset(bags: List[Bag])
A dataset of bags.
num_bags
property
num_bags: int
Returns the number of bags.
num_instances
property
num_instances: int
Returns the total number of instances.
num_neg_bags
property
num_neg_bags: int
Returns the number of negative bags.
num_neg_instances
property
num_neg_instances: int
Returns the number of negative instances.
num_pos_bags
property
num_pos_bags: int
Returns the number of positive bags.
num_pos_instances
property
num_pos_instances: int
Returns the number of positive instances.
y
property
y: ndarray
Returns all the bag labels.
from_arrays
staticmethod
from_arrays(
bags: Sequence[ndarray],
y: Sequence[float],
intra_bag_labels: Sequence[ndarray] | None = None,
) -> "BagDataset"
Create a BagDataset from raw numpy arrays.
Parameters:
- bags (Sequence[ndarray]) – sequence of arrays where each element contains the instances of a bag with shape (n_i, d).
- y (Sequence[float]) – bag-level labels corresponding to each element of bags.
- intra_bag_labels (Sequence[ndarray] | None, default: None) – optional sequence of 1D arrays with per-instance 0/1 flags. If omitted, all instances in a bag are considered positive.
Returns:
- BagDataset – dataset composed of Bag objects built from the provided arrays.
Source code in src/sawmil/bag.py
negative_bags
negative_bags() -> list[Bag]
Returns all negative bags.
Source code in src/sawmil/bag.py
negative_bags_as_singletons
negative_bags_as_singletons() -> list[Bag]
Transforms all negative bags into singleton bags, by flattening each bag (b, n, d) -> (b x n, d)
Source code in src/sawmil/bag.py
negative_instances
negative_instances() -> tuple[np.ndarray, np.ndarray]
Returns instances from:
- all negative bags (all instances),
- positive bags where intra_bag_label == 0.
Returns:
- X_neg – (M, d)
- bag_index – (M,) indices into self.bags (original positions)
Source code in src/sawmil/bag.py
positive_bags
positive_bags() -> list[Bag]
Returns all positive bags.
Source code in src/sawmil/bag.py
positive_bags_as_singletons
positive_bags_as_singletons() -> list[Bag]
Transforms all positive bags into singleton bags, by flattening each bag (b, n, d) -> (b x n, d)
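The flattening (b, n, d) -> (b x n, d) described for both singleton methods amounts to emitting one single-instance bag per instance. The sketch below illustrates this with (X, y) tuples; `bags_as_singletons` is a hypothetical helper, and treating y > 0 as "positive" is an assumption made only for this example.

```python
from typing import Callable, List, Sequence, Tuple

Instance = Sequence[float]
SimpleBag = Tuple[Sequence[Instance], float]  # (instances, bag label)


def bags_as_singletons(
    bags: Sequence[SimpleBag], keep: Callable[[float], bool]
) -> List[SimpleBag]:
    """Split every selected bag into one single-instance bag per instance."""
    out: List[SimpleBag] = []
    for X, y in bags:
        if keep(y):
            out.extend(([x], y) for x in X)
    return out


bags = [
    ([[1.0], [2.0]], 1.0),
    ([[3.0]], 0.0),
    ([[4.0], [5.0], [6.0]], 0.0),
]
neg_singletons = bags_as_singletons(bags, lambda y: y <= 0)
pos_singletons = bags_as_singletons(bags, lambda y: y > 0)
```

Each resulting bag keeps its original bag label, so two negative bags with 1 and 3 instances become 4 negative singleton bags.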
Source code in src/sawmil/bag.py
positive_instances
positive_instances() -> tuple[np.ndarray, np.ndarray]
Returns instances from positive bags with intra_bag_label == 1.
Returns:
- X_pos – (N, d)
- bag_index – (N,) indices into self.bags (original positions)
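The selection rules for positive_instances and negative_instances can be sketched as follows. This is an illustrative plain-Python version over (X, y, mask) tuples, not the sawmil code; the free functions are hypothetical, and y > 0 is assumed to mark a positive bag.

```python
from typing import List, Sequence, Tuple

Instance = Sequence[float]
LabeledBag = Tuple[Sequence[Instance], float, Sequence[int]]  # (X, y, intra labels)


def positive_instances(bags: Sequence[LabeledBag]) -> Tuple[List[Instance], List[int]]:
    """Instances flagged 1 inside positive bags, plus their bag positions."""
    X_pos: List[Instance] = []
    bag_index: List[int] = []
    for b, (X, y, mask) in enumerate(bags):
        if y > 0:
            for x, m in zip(X, mask):
                if m == 1:
                    X_pos.append(x)
                    bag_index.append(b)
    return X_pos, bag_index


def negative_instances(bags: Sequence[LabeledBag]) -> Tuple[List[Instance], List[int]]:
    """All instances of negative bags, plus instances flagged 0 in positive bags."""
    X_neg: List[Instance] = []
    bag_index: List[int] = []
    for b, (X, y, mask) in enumerate(bags):
        for x, m in zip(X, mask):
            if y <= 0 or m == 0:
                X_neg.append(x)
                bag_index.append(b)
    return X_neg, bag_index


bags = [
    ([[1.0], [2.0]], 1.0, [1, 0]),  # positive bag with one distractor
    ([[3.0], [4.0]], 0.0, [1, 1]),  # negative bag: intra labels are ignored
    ([[5.0]], 1.0, [1]),
]
X_pos, pos_idx = positive_instances(bags)
X_neg, neg_idx = negative_instances(bags)
```

The bag_index lists point back to positions in the original sequence, which is what lets a caller map each pooled instance to the bag it came from.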
Source code in src/sawmil/bag.py
Dummy Data
sawmil.data
generate_dummy_bags
generate_dummy_bags(
*,
n_pos: int = 100,
n_neg: int = 60,
inst_per_bag: Tuple[int, int] = (4, 12),
d: int = 2,
pos_centers: Sequence[Sequence[float]] = (
(+2.0, +1.0),
(+4.0, +3.0),
),
neg_centers: Sequence[Sequence[float]] = (
(-1.5, -1.0),
(-3.0, +0.5),
),
pos_scales: Sequence[Tuple[float, float]] = (
(2.0, 0.6),
(1.2, 0.8),
),
neg_scales: Sequence[Tuple[float, float]] = (
(1.5, 0.5),
(2.5, 0.9),
),
pos_intra_rate: Tuple[float, float] = (0.3, 0.8),
ensure_pos_in_every_pos_bag: bool = True,
neg_pos_noise_rate: Tuple[float, float] = (0.0, 0.05),
pos_neg_noise_rate: Tuple[float, float] = (0.0, 0.2),
outlier_rate: float = 0.01,
outlier_scale: float = 10.0,
random_state: int = 0,
) -> BagDataset
Generate a synthetic MIL dataset with mixed Gaussian components.
Returns:
- BagDataset – a dataset in which:
  - Positive bags (y=1) mix pos-like instances (intra=1) and distractors (intra=0).
  - Negative bags (y=0) may include a small fraction of pos-like contamination; intra labels default to ones in Bag, but the bag label remains 0.
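To give a feel for what a positive bag looks like, the following heavily simplified sketch samples one with the standard library. It is not the sawmil generator: `sample_positive_bag` is hypothetical, it uses a single isotropic Gaussian per class instead of the mixed components, skips noise and outliers, and assumes inst_per_bag is an inclusive range.

```python
import random
from typing import List, Tuple


def sample_positive_bag(
    rng: random.Random,
    inst_per_bag: Tuple[int, int] = (4, 12),
    pos_center: Tuple[float, float] = (2.0, 1.0),
    neg_center: Tuple[float, float] = (-1.5, -1.0),
    scale: float = 1.0,
    pos_intra_rate: Tuple[float, float] = (0.3, 0.8),
    ensure_pos: bool = True,
) -> Tuple[List[Tuple[float, float]], List[int]]:
    """Return (instances, intra-bag flags) for one positive bag."""
    n = rng.randint(*inst_per_bag)        # bag size, both ends inclusive
    rate = rng.uniform(*pos_intra_rate)   # share of pos-like instances
    flags = [1 if rng.random() < rate else 0 for _ in range(n)]
    if ensure_pos and not any(flags):
        flags[rng.randrange(n)] = 1       # guarantee at least one positive
    X = []
    for flag in flags:
        cx, cy = pos_center if flag else neg_center
        X.append((rng.gauss(cx, scale), rng.gauss(cy, scale)))
    return X, flags


X, flags = sample_positive_bag(random.Random(0))
X_again, flags_again = sample_positive_bag(random.Random(0))
```

Seeding the generator makes the bag reproducible, which is the role random_state plays in the signature above.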
Source code in src/sawmil/data/dummy.py
load_musk_bags
load_musk_bags(
*,
version: int = 1,
test_size: float = 0.2,
random_state: int = 0,
standardize: bool = True,
) -> Tuple[BagDataset, BagDataset, StandardScaler | None]
Fetch Musk from OpenML, build a BagDataset, and make a stratified split by bag label.
Parameters:
- version (int, default: 1) – 'musk' dataset version
- test_size (float, default: 0.2) – proportion of the dataset to include in the test split
- random_state (int, default: 0) – random seed for the sklearn package
- standardize (bool, default: True) – whether to standardize the features; the StandardScaler is fit on the training data
Returns:
- Tuple[BagDataset, BagDataset, StandardScaler | None] – (train_ds, test_ds), followed per the signature by a StandardScaler or None
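The two ideas in this description, a split stratified by bag label and a scaler fit on the training data only, can be sketched with the standard library. This is a stand-in for illustration, not how sawmil or scikit-learn implement them; all three helper names are hypothetical.

```python
import random
from collections import defaultdict
from statistics import fmean, pstdev
from typing import Dict, List, Sequence, Tuple


def stratified_split(
    labels: Sequence[int], test_size: float = 0.2, random_state: int = 0
) -> Tuple[List[int], List[int]]:
    """Split bag indices so each label keeps roughly the same test share."""
    rng = random.Random(random_state)
    by_label: Dict[int, List[int]] = defaultdict(list)
    for i, y in enumerate(labels):
        by_label[y].append(i)
    train: List[int] = []
    test: List[int] = []
    for y in sorted(by_label):
        idx = list(by_label[y])
        rng.shuffle(idx)
        n_test = max(1, round(test_size * len(idx)))
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return sorted(train), sorted(test)


def fit_standardizer(rows: Sequence[Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Per-feature mean and population std; constant features get std 1."""
    cols = list(zip(*rows))
    means = [fmean(c) for c in cols]
    stds = [pstdev(c) or 1.0 for c in cols]
    return means, stds


def standardize(rows, means, stds) -> List[List[float]]:
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


labels = [1] * 10 + [0] * 5
train_idx, test_idx = stratified_split(labels, test_size=0.2, random_state=0)

# Fit on training instances only, then reuse the same statistics for test data.
means, stds = fit_standardizer([[0.0, 1.0], [2.0, 1.0]])
test_scaled = standardize([[2.0, 3.0]], means, stds)
```

Fitting the statistics on the training split and only applying them to the test split keeps information about the test data out of preprocessing.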
Source code in src/sawmil/data/musk.py